METHOD OF CONTROLLING INSECTS

FIELD OF THE INVENTION
This invention relates to a method of controlling lepidopteran
insects by use of an enzyme which may be applied directly to the plant
or produced thereon by microorganisms or by genetically modifying the
plant to produce the enzyme, and to genes, microorganisms, and plants
useful in that method.



BACKGROUND OF THE INVENTION
The use of natural products, including proteins, is a well-known
method of controlling many insect pests. For example, endotoxins of
Bacillus thuringiensis (B.t.) are used to control both lepidopteran and
coleopteran insect pests. Genes producing these endotoxins have been
introduced into and expressed by various plants, including cotton,
tobacco, and tomato. There are, however, several economically
important insect pests that are not susceptible to B.t. endotoxins.
There is also a need for additional proteins which control insects for
which B.t. provides control in order to manage any development of
resistance in the population.

It is therefore an object of the present invention to provide
proteins capable of controlling lepidopteran insects and genes useful in
producing such proteins. It is a further object of the present invention to
provide genetic constructs for and methods of inserting such genetic
material into microorganisms and plant cells. It is another object of the
present invention to provide transformed microorganisms and plants
containing such genetic material.



SUMMARY OF THE INVENTION
It has been discovered that proteins that catalyze the oxidation of
3-hydroxysteroids, for example, cholesterol, will control lepidopteran
insects. They cause mortality and stunting of larvae of lepidopteran
insects. The enzymes may be applied directly to plants or introduced in
other ways such as through the application of plant-colonizing
microorganisms or by the plants themselves, which have been
transformed to produce the enzymes.

3-Hydroxysteroid oxidases (E.C. 1.1.3.6) are naturally produced by
microorganisms such as Streptomyces sp., Pseudomonas sp.,
Mycobacterium sp., Schizophyllum commune, Nocardia sp., and
Rhodococcus sp. [Smith et al., 1976, and Long et al., 1990].
Preparations of enzymes from several different sources are available
from Sigma Chemical Company, St. Louis, Missouri.

New Streptomyces genes that control the expression of
3-hydroxysteroid oxidase have been isolated and sequenced. These new
genes or genes from other known producers of 3-hydroxysteroid oxidase
may be inserted into a transformation vector cassette which is used to
transform plant-colonizing microorganisms which, when applied to
plants, express the genes producing a 3-hydroxysteroid oxidase, thereby
providing control of insects. Alternatively, genes which function in
plants and encode the subject enzymes may be inserted into
transformation vector cassettes which may be incorporated into the
genome of the plant, which then protects itself from attack by
expressing the gene and producing a 3-hydroxysteroid oxidase.
Additionally, the plant may also be transformed to co-express B.t. genes
which express proteins for the control of other insects. Examples of
plants transformed to express B.t. genes are disclosed in European
Patent Publication No. 0 385 962 (Fischhoff et al.).

In accomplishing the foregoing, there is provided, in accordance
with one aspect of the present invention, a method of controlling insect
infestation of plants comprising providing a 3-hydroxysteroid oxidase for
ingestion by the insect.

In accordance with another aspect of the present invention, there
is provided a method of producing genetically transformed plants which
express an amount of a 3-hydroxysteroid oxidase effective to control
lepidopteran insects, comprising the steps of:
a) inserting into the genome of a plant cell a recombinant,
   double-stranded DNA molecule comprising
   (i) a promoter which functions in plant cells to cause the
       production of an RNA sequence;
   (ii) a structural coding sequence that encodes for
       3-hydroxysteroid oxidase; and
   (iii) a 3' non-translated region which functions in said plant
       cells to cause the addition of polyadenylate nucleotides to
       the 3' end of the RNA sequence,
   wherein said promoter is heterologous with respect to the
   structural coding sequence and wherein said promoter is
   operatively linked with said structural coding sequence, which is
   in turn operably linked with said non-translated region;
b) obtaining transformed plant cells; and
c) regenerating from the transformed plant cells genetically
   transformed plants which express an insecticidally effective
   amount of sterol oxidase.

There is also provided, in accordance with another aspect of the
present invention, bacterial and transformed plant cells that contain
DNA comprised of the above-mentioned elements (i), (ii), and (iii).

As used herein, the term "controlling insect infestation" means
reducing the number of insects which cause reduced yield, either through
mortality, retardation of larval development (stunting), or reduced
reproductive efficiency.

As used herein, the term "structural coding sequence" means a
DNA sequence which encodes for a polypeptide, which may be made by
a cell following transcription of the DNA to mRNA, followed by
translation to the desired polypeptide.



DETAILED DESCRIPTION OF THE INVENTION
3-Hydroxysteroid oxidases catalyze the oxidation of the
3-hydroxy group of 3-hydroxysteroids to produce ketosteroids and
hydrogen peroxide. They are capable of catalyzing the oxidation of
various 3-hydroxysteroids, such as, for example, cholesterol. Most of the
previously known 3-hydroxysteroid oxidases are called "cholesterol
oxidases" (enzymatically catalogued as E.C. #1.1.3.6) but cholesterol is
only one of the 3-hydroxysteroid substrates. The use of all
3-hydroxysteroid oxidases and the genes encoding such proteins, for the
purpose of controlling insects, is within the scope of the present
invention.

3-Hydroxysteroid oxidases are commercially available for use as
reagents for serum cholesterol assays. For example, Sigma Chemical
Company, St. Louis, Missouri, offers three such 3-hydroxysteroid
oxidases (denominated as cholesterol oxidases), one from a Streptomyces
sp., one from a Pseudomonas fluorescens, and one from a Brevibacterium.
Two other sources of 3-hydroxysteroid oxidase, two streptomycetes
denominated A19241 and A19249, each of which produces a
3-hydroxysteroid oxidase, have been isolated. The organisms were
collected in Madagascar. When these organisms were cultured according
to usual methods the culture filtrates were found to affect insect larvae
as described below.

A seed culture of A19249 was started in 55 mL sterile Tryptone-
Yeast Extract broth, pH 6.8, in a 250 mL shake flask. The seed was
agitated at 250 rpm on a rotary shaker for 3 days at 30 °C. A New
Brunswick Bioflo II Bioreactor with a 2 L working volume was filled with
"medium 202" (MgSO4·7H2O 2 g/L, KH2PO4 0.5 g/L, NaCl 0.5 g/L,
CaCO3 1 g/L, ZnSO4·H2O (1 mg/mL stock) 5 mL/L, 100 mM FeEDTA
0.5 mL/L, Soluble Starch 5 g/L, Dextrose 2.5 g/L, Malt Extract 2.5 g/L,
Soytone 5 g/L). The pH was adjusted to 6.5 with 2.5 N NaOH or 1 N
HCl, and 1 mL/L of P2000, an antifoam agent, was added. The
bioreactor was sealed and autoclaved for 25 min at 250 °C. The seed, at
3 days growth, was used to inoculate the fermentor at 2% or 40 mL.
The fermentation took place at 30 °C with an airflow of 1 L/min and
agitation running at 500 rpm. The fermentation was harvested after 40
h.
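The batch quantities implied by this recipe can be worked out with a short calculation. The sketch below is illustrative only: the per-litre values are our reading of the medium 202 list above, and the helper names (`scale_recipe`, `inoculum_volume_ml`) are hypothetical, not part of the original procedure.

```python
# Medium 202 components per litre, as read from the recipe above (assumed reading).
MEDIUM_202_PER_L = {
    "MgSO4.7H2O (g)": 2.0,
    "KH2PO4 (g)": 0.5,
    "NaCl (g)": 0.5,
    "CaCO3 (g)": 1.0,
    "ZnSO4.H2O, 1 mg/mL stock (mL)": 5.0,
    "100 mM FeEDTA (mL)": 0.5,
    "Soluble starch (g)": 5.0,
    "Dextrose (g)": 2.5,
    "Malt extract (g)": 2.5,
    "Soytone (g)": 5.0,
}


def scale_recipe(per_litre, volume_l):
    """Return the amount of each component needed for volume_l litres."""
    return {name: amount * volume_l for name, amount in per_litre.items()}


def inoculum_volume_ml(working_volume_l, fraction):
    """Seed volume in mL for a given inoculation fraction (v/v)."""
    return working_volume_l * 1000.0 * fraction


batch = scale_recipe(MEDIUM_202_PER_L, 2.0)   # 2 L working volume
seed_ml = inoculum_volume_ml(2.0, 0.02)       # 2% inoculum

for name, amount in batch.items():
    print(f"{name}: {amount}")
print(f"seed volume: {seed_ml} mL")
```

For the 2 L working volume, a 2% inoculum corresponds to 40 mL of seed culture, which fits within the 55 mL seed flask.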

Each of these enzymes has demonstrated control of insects as
shown below. The P. fluorescens 3-hydroxysteroid oxidase is
immunologically distinct from the Streptomyces enzymes, but it also
controls insects.

Other organisms producing 3-hydroxysteroid oxidases of the
present invention may be identified by assaying culture filtrates or
individual proteins for 3-hydroxysteroid oxidase activity using a
spectrophotometric assay, described below, which measures hydrogen
peroxide production in the presence of a 3-hydroxysteroid, for example,
cholesterol [Gallo, 1981].
BIOEFFICACY ASSAYS

Larval Bioassay

Lepidopteran larvae were tested on artificial agar-based diet
(Marrone et al., 1985) treated with the indicated amount of the A19249
3-hydroxysteroid oxidase (cholesterol oxidase) for six days. The results
are shown in Table 1. Percent stunting is defined as

\[
\text{Percent stunting} = \frac{\text{larval weights (Hepes control)} - \text{larval weights (cholesterol oxidase)}}{\text{larval weights (Hepes control)}} \times 100\%
\]
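As a worked illustration of the stunting definition, the following minimal sketch computes percent stunting from two groups of larval weights. It assumes that "larval weights" means the group mean, which the text does not state, and the weights used are hypothetical values chosen only to show the arithmetic.

```python
def percent_stunting(control_weights, treated_weights):
    """Percent stunting from the mean larval weight of each group.

    Assumption: group means are compared; the source only says
    "larval weights".
    """
    control_mean = sum(control_weights) / len(control_weights)
    treated_mean = sum(treated_weights) / len(treated_weights)
    return (control_mean - treated_mean) / control_mean * 100.0


# Hypothetical weights in mg, for illustration only.
control = [50.0, 48.0, 52.0]   # Hepes buffer control
treated = [7.0, 8.0, 6.0]      # enzyme-treated diet
example = percent_stunting(control, treated)
print(round(example, 1))
```

A value of 0 means treated larvae weighed the same as controls, and values approaching 100 mean growth was almost completely arrested.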

An extended test was performed with tobacco budworm larvae to
test the effect of the stunting noted in the six-day test. Tobacco
budworm eggs were added to artificial diet (as described above)
containing either buffer or 100 ppm A19249 3-hydroxysteroid oxidase
(cholesterol oxidase). After seven days, some mortality as compared to
the controls was noted. Surviving larvae were moved to fresh diet
(control or treated, as appropriate) every seven days. Percent mortality
(corrected for control mortality) is reported for the 7, 10, and 14 day
periods in Table 2. The corrected number of larvae was 23. By the 27th
day only two larvae were alive; both were very stunted and never
pupated. 92% of the control larvae survived and all survivors pupated
by day 27. This experiment demonstrates that 3-hydroxysteroid oxidase
essentially arrests the development of tobacco budworm larvae.
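The text does not say how mortality was corrected for control mortality. One widely used convention is Abbott's formula, sketched below purely as an assumption about how such a correction can be computed, not as the procedure actually used here. The input percentages are hypothetical.

```python
def abbott_corrected_mortality(treated_pct, control_pct):
    """Abbott's correction: mortality attributable to the treatment.

    treated_pct and control_pct are observed percent mortalities.
    This is an assumed convention; the source does not name its method.
    """
    if not 0.0 <= control_pct < 100.0:
        raise ValueError("control mortality must be in [0, 100)")
    return (treated_pct - control_pct) / (100.0 - control_pct) * 100.0


# Hypothetical observed values, for illustration only.
corrected = abbott_corrected_mortality(82.0, 10.0)
print(round(corrected, 1))
```

When control mortality is zero the corrected value equals the observed treated mortality, so the correction matters only when some control larvae die.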



Table 1

Insect                  Stage     Dose (µg/mL)   Stunting
tobacco budworm         egg/lv    30             0
                        lv        100            86%
corn earworm            lv        50             0
                        lv        100            35%
fall armyworm           lv        30             0
tobacco hornworm        lv        30             0
                        lv        100            30%
pink bollworm           lv        50             0
                        lv        100            30%
European corn borer     lv        50             0
                        lv        100            46%
beet armyworm           lv        100            76%
black cutworm           lv        100            68%

Table 2

Interval (days)    Percent Mortality
7                  20
10                 61
14                 80


MODE OF ACTION STUDIES
The following studies show that 3-hydroxysteroid oxidase has a
direct effect on the insect itself and that the activity demonstrated in
the experiments described above cannot be attributed to the enzyme's
effect on the insect's diet, for example by sterol depletion. Boll weevil
and lepidopteran larvae are most susceptible to the enzyme. It is
believed that this specificity is due to the effect of 3-hydroxysteroid
oxidase on the midgut of the insect as explained in more detail below.
Other insects with similar midgut physiologies may also be controlled by
3-hydroxysteroid oxidase. In addition, 3-hydroxysteroid oxidases other
than those tested and reported herein may control a different spectrum
of insects with different midgut physiologies.



Cottonseed Diet Assay

Treatment diet was made by mixing 30 g of Pharmamedia™
(Traders Protein) cottonseed flour into 170 mL of a 1.6% agar solution at
50 °C, containing 0.13% propionic acid, 0.014% phosphoric acid, and 30
mg each of streptomycin sulfate and chlortetracycline. Before mixing,
10% KOH was used to adjust the pH to 6.2. Pharmamedia™ is a flour
made up of cottonseed embryos. The diet was incubated in a water bath
at 40 °C. Dilutions of the Sigma Streptomyces 3-hydroxysteroid oxidase
(cholesterol oxidase) were incorporated into the diets. Tobacco budworm
larvae are 68% stunted when exposed to 3-hydroxysteroid oxidase (100
ppm) in cottonseed diet.
Cotton Callus Diet Assay

Cotton callus assays were conducted using Coker 312 cotton
callus and a 96-well insect diet tray. Each well contained 0.5 mL of gelled
2% agar (containing 0.13% propionic acid and 0.014% phosphoric acid)
that was covered with a filter paper disc. A piece of callus
(approximately 150 mg) was placed in each well. For each replicate, 25
µL of a stock solution of 3-hydroxysteroid oxidase (1.25 mg/mL stock) or
25 µL of 25 mM Hepes buffer (pH 7.5) was pipetted onto each piece of
callus. Tobacco budworm eggs (4-8 eggs/well, 0-12 hr from hatching) in
0.15% agar were pipetted onto the filter disc next to the callus.
Individual larvae were transferred to freshly treated callus samples
every three days. Larvae were weighed after 10 days.
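The enzyme load per callus piece follows from the pipetted volume and stock concentration. The sketch below is a rough, illustrative calculation that reads the application as 25 µL of a 1.25 mg/mL stock on about 150 mg of callus; the function names are hypothetical, and the result is a nominal figure that ignores losses to the filter paper.

```python
def enzyme_per_callus_ug(volume_ul, stock_mg_per_ml):
    """Mass of enzyme applied, in micrograms (1 mg/mL equals 1 ug/uL)."""
    return volume_ul * stock_mg_per_ml


def dose_ug_per_g(enzyme_ug, callus_mg):
    """Nominal enzyme dose per gram of callus tissue."""
    return enzyme_ug / (callus_mg / 1000.0)


applied_ug = enzyme_per_callus_ug(25.0, 1.25)   # 25 uL of 1.25 mg/mL stock
nominal_dose = dose_ug_per_g(applied_ug, 150.0)  # about 150 mg of callus
print(applied_ug, round(nominal_dose, 1))
```

On these assumptions each piece receives about 31 µg of enzyme, a nominal load of roughly 200 µg per gram of callus, in the same range as the 100 µg/mL doses used in the artificial diet tests.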

Stunting of larvae was observed in the treated wells, although
reduced as compared to the artificial diet or the cottonseed diet. This
reduced level of stunting with cotton callus is probably due to the
pipetted enzyme solution not completely covering the callus tissue and
the loss of some enzyme onto the filter paper.

Pre-ingestion Effect on Diet

There are no lethal or stunting effects from feeding larvae a diet
sample that was incubated for one week with a 3-hydroxysteroid oxidase
and then heated to 80 °C to denature the enzyme prior to using it in the
above assay. This further demonstrates that the mode of action of
3-hydroxysteroid oxidase is not dependent on sterol depletion of the
nutrient source but that the enzyme is directly active upon the insect.

Microscopy Studies

Tobacco budworm larvae were reared from eggs on artificial diet
containing 100 µg/mL of 3-hydroxysteroid oxidase or Hepes buffer
control. After ten days, midgut epithelia were dissected and processed
for microscopy as described by Purcell et al., 1993. The midguts from
treated larvae exhibited a variety of histological and ultrastructural
alterations as compared with Hepes buffer-treated controls.
3-Hydroxysteroid oxidase induced a lateral constriction and apical-basal
elongation of the cells in the midgut epithelium, resulting in a widening
of the intercellular spaces and the formation of an irregular and
convoluted lumenal surface. The cellular attenuation was also
characterized by localized apical cytoplasmic blebbing and microvillar
denudation. This denudation occasionally generated detached
cytoplasmic fragments in the lumen. Localized cytolysis was observed,
but the epithelium as a whole remained intact. These observations
suggest that exposure to 3-hydroxysteroid oxidase compromises the
integrity of the apical plasma membrane, which results in a secondary
generalized cellular hypersensitivity to osmotic swelling and lysis.

Cholesterol is required for the integrity and normal function of
virtually all cellular membranes. A reduction in the availability of
cholesterol for incorporation into and maintenance of cellular
membranes due to metabolism by 3-hydroxysteroid oxidase could
therefore be one possible mechanism for toxicity of this agent. However,
the observations described above show the most dramatic structural
alterations are localized to the tips of the midgut cells in contact with
the 3-hydroxysteroid oxidase in the gut lumen. This suggests that
3-hydroxysteroid oxidase may act directly to alter cholesterol already
incorporated into the membrane. In addition, diet treatment studies
described above demonstrated that the diet was not altered prior to
ingestion in a manner which compromises tobacco budworm growth,
again suggesting a direct effect of the enzyme on the insect.

Mode of Action Theory

While not being bound by this theory, it is believed that the
3-hydroxysteroid oxidase enzyme stunts the growth of lepidopteran
larvae by some action in the gut after ingestion. The bioassay and
morphological data show that there was clearly a pathological condition
resulting from ingestion of diet containing 3-hydroxysteroid oxidase.

ENZYME IDENTIFICATION

The active proteins from the Madagascar Streptomyces
microorganisms were isolated, purified, partially sequenced, and
identified as 3-hydroxysteroid oxidases.
Protein Isolation

Each culture filtrate was purified by first sizing on YM10
membranes (Amicon) to a [>10 kDa] fraction, followed by multiple
chromatography runs on an FPLC Mono Q HR10/10 (Pharmacia LKB,
Piscataway, NJ) column. For chromatography on the Mono Q column,
the samples were loaded on the column in 25 mM Hepes pH 7.5 and
eluted with a gradient to 1.0 M KCl in 25 mM Hepes pH 7.5. Fractions
were collected and aliquots of each were filtered through 0.2 µm Acrodisc
syringe tip filters. Each was tested in an insect assay. Aliquots of
insecticidally active fractions were electrophoresed on SDS-PAGE
[Laemmli, 1970] using a Daiichi Double Gel Device and 10-20% mini-gel.
Proteins were visualized by silver staining using Daiichi silver stain
reagent kit. The active enzymes of the present invention, isolated from
the novel microorganisms, were found to be a 52.5 kDa protein.

Amino Acid Sequences

An SDS-PAGE gel of the protein produced by Streptomyces
A19241, isolated as above, was blotted onto PVDF paper (Immobilon,
Millipore) using the protocol of Matsudaira [Matsudaira, 1987]. The
N-terminus was sequenced using automated Edman degradation
chemistry. A gas phase sequencer (Applied Biosystems, Inc.) was used
for the degradation using the standard sequencer cycle. The respective
PTH-aa derivatives were identified by reverse phase HPLC analysis in
an on-line fashion employing a PTH analyzer (Applied Biosystems, Inc.)
fitted with a Brownlee 2.1 mm i.d. PTH-C18 column. For internal
sequences, digestions were carried out on purified 3-hydroxysteroid
oxidase from A19249 using trypsin (TPCK-treated, from Worthington
Biochemicals Corp., Freehold, NJ). Fragments were then purified by
reverse phase HPLC and sequenced in an N-terminal fashion.

The resulting partial sequences were compared to known proteins
and a strong (71%) homology was found with the reported fourteen
amino acid sequence at the N-terminus of a 3-hydroxysteroid oxidase
isolated from a Streptomyces species [Ishizaki et al., 1989]. The reported
enzyme has an Mr of 54.9 kDa which agrees well with the Mr of 52.5
kDa of the isolated enzyme.

Six internal fragments of the purified enzyme from A19249, also
having homology to six regions of the reported enzyme, were sequenced.
The fragments had 95, 76, 64, 58, 89, and 100 percent sequence
identities.

Amino Acid Composition Determination and Comparison

The amino acid composition of the 3-hydroxysteroid oxidase
produced by A19249 was determined and compared with the
composition of the reported Streptomyces enzyme. The samples were
subjected to acid hydrolysis (6N HCl, vapor phase hydrolysis using a
Waters Picotag workstation, 24 hr, 110 °C). All analyses were
performed after post-column derivatization of the hydrolysates using
ninhydrin [Moore et al., 1963]. A Beckman Model 6300 Auto analyzer
was employed for the actual determinations. The SΔn/N statistic is
used to compare two compositions in order to make a prediction about
their relatedness. The formula for the statistic is:

\[
S\Delta n/N = \frac{1}{2} \sum_{i} \frac{(n_{Ai} - n_{Bi})^{2}}{N}
\]

where A is one composition, B is the other composition, i is each amino
acid, n is the number of each amino acid, and N is the total number of
amino acids in the protein. If SΔn/N is <0.42, then there is a greater than
95% chance that the proteins are related. The smaller the value, the
more closely the determined compositions match.

The SΔn/N statistic for the A19249 protein compared to the
reported enzyme is 0.36, indicating that the two are highly related.
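The composition comparison can be sketched in a few lines. The example below is a minimal illustration of the statistic as defined above (half the sum of squared differences in residue counts, divided by N), with the 0.42 threshold taken from the text. The compositions are hypothetical and truncated to three residues, and N is passed in explicitly because the text does not say which protein's length to use when the two differ.

```python
def s_delta_n_over_n(comp_a, comp_b, total_residues):
    """Composition-difference statistic: 0.5 * sum((nA_i - nB_i)^2) / N.

    comp_a, comp_b: dicts mapping amino acid name to residue count.
    total_residues: N, supplied by the caller (an assumption here).
    """
    residues = set(comp_a) | set(comp_b)
    half_sum_sq = 0.5 * sum(
        (comp_a.get(r, 0) - comp_b.get(r, 0)) ** 2 for r in residues
    )
    return half_sum_sq / total_residues


def likely_related(statistic, threshold=0.42):
    """Apply the <0.42 relatedness criterion quoted in the text."""
    return statistic < threshold


# Hypothetical three-residue compositions, for illustration only.
protein_a = {"Ala": 50, "Gly": 45, "Ser": 30}
protein_b = {"Ala": 48, "Gly": 47, "Ser": 31}
value = s_delta_n_over_n(protein_a, protein_b, total_residues=125)
print(value, likely_related(value))
```

Because the differences are squared, a single residue whose count differs widely dominates the statistic, so hydrolysis losses for labile residues can inflate it.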

3-Hydroxysteroid Oxidase Assay

The identity of the enzyme was confirmed by testing its ability to
oxidize a 3-hydroxysteroid, specifically cholesterol. The enzyme is added
to a reagent mixture comprising horseradish peroxidase (20 U/mL),
phenol (14 mM), 4-amino antipyrine (0.82 mM), Triton® X-100 (0.05%)
and phosphate buffer (100 mM, pH 7). The sterol in isopropanol is then
added and the absorbance at 500 nm monitored. One unit of activity is
defined as the amount of enzyme required to oxidize 1 µmole of sterol per
minute at 20 °C.
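Converting an absorbance trace into units of activity requires an extinction coefficient and a reaction stoichiometry, neither of which is given in the text. The sketch below therefore treats both as caller-supplied assumptions: `epsilon_mM_cm` is a hypothetical parameter for the coloured product at 500 nm, and `h2o2_per_dye` defaults to 2 on the assumption that two molecules of hydrogen peroxide are consumed per molecule of dye in the phenol and 4-aminoantipyrine chemistry. One unit is taken as 1 µmole of sterol oxidized per minute.

```python
def activity_units_per_ml(delta_a500_per_min, epsilon_mM_cm,
                          reaction_volume_ml, enzyme_volume_ml,
                          path_cm=1.0, h2o2_per_dye=2.0):
    """Estimate enzyme activity (units per mL of enzyme sample).

    Beer-Lambert: rate of dye formation (mM/min) = dA/min / (epsilon * path).
    1 mM equals 1 umol/mL, and one sterol oxidized yields one H2O2.
    """
    dye_rate_umol_per_ml_min = delta_a500_per_min / (epsilon_mM_cm * path_cm)
    sterol_rate_umol_per_min = (
        dye_rate_umol_per_ml_min * reaction_volume_ml * h2o2_per_dye
    )
    return sterol_rate_umol_per_min / enzyme_volume_ml


# Hypothetical readings and a placeholder extinction coefficient.
units = activity_units_per_ml(
    delta_a500_per_min=0.1,
    epsilon_mM_cm=10.0,      # placeholder, not a measured value
    reaction_volume_ml=1.0,
    enzyme_volume_ml=0.01,
)
print(round(units, 3))
```

The relative rates reported for different sterols do not depend on these assumed constants, since they cancel when each rate is divided by the cholesterol rate.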

The activity levels of the enzymes are reported in Table 3 for
3-hydroxysteroids representative of various classes of 3-hydroxysteroids.
The enzyme sources are as follows:

1 = A19249
2 = A19241
3 = Sigma Streptomyces
4 = Sigma Pseudomonas

Table 3
Relative Rate for Enzymes

Sterol                 1      2      3      4
cholesterol            100    100    100    100
dihydrocholesterol     56     56     59     69
dehydrocholesterol     13     12     7      47
lathosterol            28     34     27     71
stigmasterol           22     28     11     21
sitosterol             88     65     49     50
campesterol*           65     64     45     49
fucosterol             22     20     12     68
lanosterol             <1     <1     <1     1
ecdysone               <1     <1     <1     <1
20-OH ecdysone         <1     <1     <1     <1

*65/35 mixture of campesterol and dihydrobrassicasterol

Immunological Comparison of Enzymes

The Sigma Streptomyces enzyme is immunologically related to the
3-hydroxysteroid oxidases produced by the isolates of the present
invention, numbers A19241 and A19249, as demonstrated by Western
blotting [Burnette et al., 1981] using polyclonal antisera generated
against the Sigma Streptomyces enzyme. The antisera recognized both
enzymes produced by the isolates. The 3-hydroxysteroid oxidase from P.
fluorescens was not recognized by the antisera. This demonstrates that
immunologically distinct 3-hydroxysteroid oxidases are toxic to insects.

GENETIC IDENTIFICATION

The 3-hydroxysteroid oxidase gene was isolated from one of the
Streptomyces microorganisms isolated in Madagascar and its sequence
determined.

WO 95/01098    PCT/US94/07252

Cloning of the 3-Hydroxysteroid Oxidase Gene from A19249

As discussed above, peptide sequences of purified 3-hydroxysteroid
oxidase from A19249 were obtained for four regions of the protein. These
peptide sequences were compared to a database of known protein
sequences, and this comparison revealed that the A19249 protein showed
a high degree of homology to a known 3-hydroxysteroid oxidase from
Streptomyces [Ishizaki]. Comparing the A19249 peptide sequences to this
known protein sequence, these peptides were assigned to their likely
positions in the A19249 protein sequence. The sequence derived from the
intact 3-hydroxysteroid oxidase from A19249 corresponded to a region
near the N-terminus of the secreted form of the enzyme from the
published sequence. From this it was concluded that the A19249
N-terminal peptide sequence was also likely to correspond to a "mature"
secreted form of the protein lacking its putative secretory signal
sequence. This was later confirmed by the DNA sequence analysis of the
A19249 gene (see below). Three peptides were used to construct
hybridization probes for isolation of the A19249 3-hydroxysteroid oxidase
gene. Peptide N2 (SEQ ID NO:1) corresponded to N-terminal amino acids
29 - 43 of the known mature protein sequence (sequence without the
putative signal peptide); peptide C1 (SEQ ID NO:2) to amino acids
434 - 449 of the mature protein sequence; and peptide C2 (SEQ ID NO:3)
to amino acids 464 - 475 of the mature protein sequence.

N2 (SEQ ID NO:1):
ValSerThrLeuMetLeuGluMetGlyGlnLeuTrpAsnGlnPro
C1 (SEQ ID NO:2):
AlaPheAlaAspAspPheCysTyrHisProLeuGlyGlyCysValLeu
C2 (SEQ ID NO:3): AsnLeuTyrValThrAspGlySerLeuIleProGly

Based on these peptide sequences, three long nondegenerate
oligonucleotides, corresponding to 3-hydroxysteroid oxidase peptide
sequences from A19249, were designed using Streptomyces preferred
codons. The oligonucleotides N2 (SEQ ID NO:4), C1 (SEQ ID NO:5), and
C2 (SEQ ID NO:6) correspond to the peptides N2, C1, and C2 described
above.
N2 Probe (SEQ ID NO:4):
gtgtccaccctgatgctggagatgggccagctgtggaaccagccc
C1 Probe (SEQ ID NO:5):
gccttcgccgacgacttctgctaccacccgctcggcggctgcgtcctg
C2 Probe (SEQ ID NO:6): aacctctacgtgaccgacggttcgctgatcccgggt
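Each probe is a direct back-translation of its peptide, so translating the probe with the standard genetic code should reproduce the peptide residue by residue. The sketch below is an illustrative check: the probe strings are copied from the listing above, while the codon table construction and the `translate` helper are our own and assume the standard code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "***",
}

PROBES = {
    "N2": "gtgtccaccctgatgctggagatgggccagctgtggaaccagccc",
    "C1": "gccttcgccgacgacttctgctaccacccgctcggcggctgcgtcctg",
    "C2": "aacctctacgtgaccgacggttcgctgatcccgggt",
}


def translate(dna):
    """Translate a DNA string in frame 1 into three-letter residue codes."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)
    )


for name, seq in PROBES.items():
    print(name, len(seq) // 3, translate(seq))
```

The translations are 15, 16, and 12 residues long, matching the stated spans 29 - 43, 434 - 449, and 464 - 475. N2 ends in Asn-Gln-Pro and the tenth residue of C2 is Ile.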

Probes N2 (SEQ ID NO:4), C1 (SEQ ID NO:5), and C2 (SEQ ID
NO:6) were all used as hybridization probes on Southern blots of A19249
genomic DNA. All three probes hybridized to the same 2.2 kb band in
BamHI-digested DNA, but N2 (SEQ ID NO:4) hybridized to a different
fragment than C1 (SEQ ID NO:5) and C2 (SEQ ID NO:6) did in SalI and
BglII digests. This indicated that SalI and BglII cut within the coding
sequence of the 3-hydroxysteroid oxidase gene from A19249, which was
confirmed by DNA sequence analysis.
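The inference from the Southern blots can be illustrated with a toy restriction map: if an enzyme cuts between the N-terminal and C-terminal probe targets, the probes land on different fragments; if it cuts only outside the gene, they share one fragment. All coordinates below are hypothetical and chosen only so that the BamHI fragment is 2.2 kb long, and each probe is simplified to a single map position.

```python
import bisect


def fragment_index(cut_positions, probe_position):
    """Index of the restriction fragment containing probe_position."""
    return bisect.bisect_right(sorted(cut_positions), probe_position)


# Hypothetical map coordinates in bp, for illustration only.
PROBE_POSITIONS = {"N2": 300, "C1": 1500, "C2": 1600}
CUT_SITES = {
    "BamHI": [100, 2300],   # flanks the gene: one 2.2 kb fragment
    "SalI": [900, 3000],    # one cut between N2 and the C probes
    "BglII": [50, 1200],    # one cut between N2 and the C probes
}


def same_fragment(enzyme):
    """True if all three probes fall on a single fragment for this digest."""
    hits = {
        fragment_index(CUT_SITES[enzyme], pos)
        for pos in PROBE_POSITIONS.values()
    }
    return len(hits) == 1


for enzyme in CUT_SITES:
    print(enzyme, same_fragment(enzyme))
```

In this toy map only the BamHI digest keeps all three probes together, mirroring the observation that N2 separates from C1 and C2 in the SalI and BglII digests.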

The 3-hydroxysteroid oxidase gene from A19249 was isolated using
the three synthetic oligonucleotides as hybridization probes on a library of
DNA fragments of A19249 DNA in a lambda phage vector. A library was
made in lambda EMBL3 using partial-digest MboI DNA fragments of
A19249. These probes were used to screen approximately 72,000 lambda
phage plaques from the primary library. Primary plaque screening was
performed using N2 (SEQ ID NO:4) plus C2 (SEQ ID NO:6). A total of 12
recombinant plaques that hybridized to the N- and C-terminal probes were
picked and purified by a second round of hybridization screening with
probes N2 (SEQ ID NO:4) and C2 (SEQ ID NO:6). Southern blot analysis
revealed that, in five of six lambda clones analyzed, a 2.2 kb BamHI
fragment hybridized to both the N- and C-terminal probes. This result
confirmed the earlier Southern hybridization analysis that indicated a
2.2 kb BamHI fragment contained the 3-hydroxysteroid oxidase gene.
This 2.2 kb DNA fragment was cloned into plasmid vector pUC18
[Yanisch-Perron et al., 1985] in both orientations for further analysis.
Restriction mapping showed that there were internal SalI and BglII sites
as predicted by the Southern hybridization analysis. These sites are also
conserved compared to the published 3-hydroxysteroid oxidase gene
sequence. The BamHI fragment was further subcloned into four
fragments for direct DNA sequencing.

Sequence Analysis of the 3-Hydroxysteroid Oxidase Gene

A total of 1865 nucleotides of DNA sequence from the 2.2 kb
BamHI fragment were determined by direct DNA sequence analysis of
subclones of this fragment using the dideoxy chain termination method.
This sequence is identified as SEQ ID NO:7. This DNA sequence contains
noncoding flanking regions at both the 3' and 5' ends. Analysis of this
DNA sequence revealed a single long open reading frame that encodes a
secretory signal peptide and the mature 3-hydroxysteroid oxidase protein
of 43 and 504 amino acids, respectively. It is 84.37% identical to the
published 3-hydroxysteroid oxidase nucleotide sequence. The derived
amino acid sequence is 81.685% identical to the published
3-hydroxysteroid oxidase sequence. It is identified as SEQ ID NO:8.
Examination of the A19249 DNA sequence and comparison to the
N-terminal amino acid sequence of intact 3-hydroxysteroid oxidase from
A19249 revealed that the A19249 gene encoded a protein that includes a
signal peptide sequence, which is apparently cleaved during secretion of
the protein from the cells. Thus the N-terminus of the mature protein
from A19249 begins with Ser-Gly-Gly-Thr-Phe, identified as SEQ ID
NO:12.
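The reported lengths can be cross-checked with simple arithmetic. The sketch below assumes that the whole open reading frame lies within the 1865 sequenced nucleotides and ends with a single stop codon; neither detail is spelled out in the text, and the helper name is hypothetical.

```python
SIGNAL_PEPTIDE_AA = 43     # from the text
MATURE_PROTEIN_AA = 504    # from the text
SEQUENCED_NT = 1865        # from the text


def orf_length_nt(n_residues, include_stop=True):
    """Length in nucleotides of an ORF encoding n_residues amino acids."""
    return 3 * n_residues + (3 if include_stop else 0)


precursor_aa = SIGNAL_PEPTIDE_AA + MATURE_PROTEIN_AA
orf_nt = orf_length_nt(precursor_aa)
flanking_nt = SEQUENCED_NT - orf_nt

print(precursor_aa, orf_nt, flanking_nt)
```

On these assumptions the precursor is 547 residues long, the reading frame spans 1644 nucleotides, and about 221 nucleotides of noncoding sequence are split between the 5' and 3' flanks.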

GENETIC TRANSFORMATION

A 3-hydroxysteroid oxidase gene can be isolated from novel
organisms or may be obtained from known sources, such as the
Rhodococcus sp. described by Long et al., in WO 90 05,788. This gene may
then be used to transform bacterial cells or plant cells to enable the
production of 3-hydroxysteroid oxidase and carry out methods of this
invention. Examples of how this may be done with the gene of A19249
are given below.

Mutagenesis of the A19249 Gene

In order to incorporate the A19249 gene into vectors appropriate
for expression of the 3-hydroxysteroid oxidase in heterologous bacterial or
plant hosts, it was necessary to introduce appropriate restriction sites
near the ends of the gene. The goals of this mutagenesis were to create
cassettes that included the protein coding sequence with minimal
noncoding flanking sequences and to incorporate useful restriction sites
to mobilize these cassettes. Cassettes were designed that would allow
mobilization of the intact coding sequence including the signal peptide or
just the mature coding sequence. To incorporate these cassettes into
appropriate bacterial or plant expression vectors, an NcoI restriction site
was engineered at the N-terminus of the intact protein sequence or at the
N-terminus of the mature protein sequence. A BamHI site was
engineered just after the termination codon of the intact coding sequence.
Three mutagenesis primers were designed to create these cassettes, as
shown below. Mutagenesis with primer Chossn (SEQ ID NO:9)
substituted three amino acids (MAT) for valine and asparagine at the
N-terminus of the signal peptide of the intact protein and Chomnr (SEQ
ID NO:10) added two amino acids (MA) at the N-terminus of the mature
protein. This was necessary to allow incorporation of the NcoI restriction
site. Mutagenesis with primer Cho3br (SEQ ID NO:11) incorporated a
BamHI site at the 3' end of the coding sequence. Primers Chomnr and
Cho3br were used to direct formation of the antisense strand of DNA.

Chossn (SEQ ID NO:9): CTCAGGAGCACCATGGCGACCGCACAC
(NcoI site: CCATGG)
Chomnr (SEQ ID NO:10): GTGCCGCCGGAGGCCATGGGGGCGGTGGC
(NcoI site: CCATGG)
Cho3br (SEQ ID NO:11):
GCCCCGCCCGTCGGATCCGTCAGGAACCCG (BamHI site: GGATCC)

The resulting modified sequences were identified as SEQ ID NO:13
encoding for the intact protein and SEQ ID NO:14 for the mature protein.
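A quick consistency check on the mutagenesis primers is to confirm that each carries its intended restriction site and that the new start codons read as described. In the sketch below the primer strings are our reading of the listing above, with the NcoI (CCATGG) and BamHI (GGATCC) recognition sequences restored where the sites are marked; treat them as assumptions. The helper function is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

RECOGNITION_SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

# Primer name -> (sequence as read from the text, engineered site).
PRIMERS = {
    "Chossn": ("CTCAGGAGCACCATGGCGACCGCACAC", "NcoI"),
    "Chomnr": ("GTGCCGCCGGAGGCCATGGGGGCGGTGGC", "NcoI"),
    "Cho3br": ("GCCCCGCCCGTCGGATCCGTCAGGAACCCG", "BamHI"),
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


for name, (seq, enzyme) in PRIMERS.items():
    site = RECOGNITION_SITES[enzyme]
    print(name, enzyme, "site present:", site in seq)

# Chossn is a sense-strand primer: the codons from ATG onward.
chossn = PRIMERS["Chossn"][0]
start = chossn.find("ATG")
print(chossn[start:start + 9])          # ATG GCG ACC = Met-Ala-Thr

# Chomnr directs the antisense strand, so read its reverse complement.
chomnr_sense = reverse_complement(PRIMERS["Chomnr"][0])
start = chomnr_sense.find("ATG")
print(chomnr_sense[start:start + 15])   # ATG GCC TCC GGC GGC
```

Read this way, Chossn begins the intact cassette with Met-Ala-Thr, and the sense strand specified by Chomnr begins the mature cassette with Met-Ala followed by Ser-Gly-Gly, consistent with the mature N-terminus Ser-Gly-Gly-Thr-Phe. Both recognition sites are palindromic, so they appear on either strand.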

25 Emression nf S-Hvdro^a teroid Oxidase inE, coli 

The NcoI-BamHI fragments containing either the intact protein
coding sequence or the mature protein coding sequence were inserted into
a vector designed for protein expression in E. coli, vector pKK233-2
(Pharmacia LKB, Piscataway, NJ). pKK233-2 contains the IPTG-inducible
trc promoter. The vector containing the intact (full-length) protein
coding sequence as modified (SEQ ID NO:13) is designated pMON20909.
The vector containing the mature protein coding sequence as modified
(SEQ ID NO:14) is designated pMON20907. E. coli XL1 Blue cells
(Stratagene, San Diego, CA) modified with pMON20909 expressed
3-hydroxysteroid oxidase at higher levels of enzymatic activity than cells
modified with pMON20907. The protein was extracted and purified from
4 liters of IPTG-induced E. coli containing pMON20909. The soluble
fraction from sonicated bacterial lysate was concentrated and dialyzed,
and then partially purified by Mono Q chromatography to yield 11 units
of 3-hydroxysteroid oxidase activity. Western blot analysis indicates that
the signal sequence of the intact protein is cleaved in E. coli, but the
exact site of cleavage was not determined. Analysis of the recovered
protein showed a five-fold reduction in enzymatic activity relative to the
A19249 protein, but the loss has not been explained by DNA sequencing,
which found no alterations that would explain loss of enzymatic activity
in plant protoplasts or E. coli.

The recovered protein was tested against tobacco budworm and
resulted in 88% stunting at a dose of 100 µg/ml.

Expression of 3-Hydroxysteroid Oxidase in Plant Colonizing Bacteria

To control insects, it may be desirable to express 3-hydroxysteroid
oxidase in plant colonizing bacteria, and then apply these bacteria to the
plant. As the insect feeds on the plant, it ingests a toxic dose of
3-hydroxysteroid oxidase produced by the plant colonizers. Plant
colonizers can be either those that inhabit the plant surface, such as
Pseudomonas or Agrobacterium species, or endophytes that inhabit the
plant vasculature, such as Clavibacter species. For surface colonizers, the
3-hydroxysteroid oxidase gene may be inserted into a broad host range
vector capable of replicating in these Gram-negative hosts. Examples of
such vectors are pKT231 of the IncQ incompatibility group [Bagdasarian
et al., 1981] or pVK100 of the IncP group [Knauf, 1982]. For endophytes
the 3-hydroxysteroid oxidase gene can be inserted into the chromosome
by homologous recombination or by incorporation of the gene onto an
appropriate transposon capable of chromosomal insertion in these
endophytic bacteria.



Plant Gene Construction 

The expression of a plant gene which exists in double-stranded DNA
form involves transcription of messenger RNA (mRNA) from one strand
of the DNA by RNA polymerase enzyme, and the subsequent processing
of the mRNA primary transcript inside the nucleus. This processing
involves a 3' non-translated region which adds polyadenylate nucleotides
to the 3' end of the RNA. Transcription of DNA into mRNA is regulated
by a region of DNA usually referred to as the "promoter." The promoter
region contains a sequence of bases that signals RNA polymerase to
associate with the DNA and to initiate the transcription of mRNA using
one of the DNA strands as a template to make a corresponding strand of
RNA.

A number of promoters which are active in plant cells have been
described in the literature. Such promoters may be obtained from plants
or plant viruses and include, but are not limited to, the nopaline synthase
(NOS) and octopine synthase (OCS) promoters (which are carried on
tumor-inducing plasmids of Agrobacterium tumefaciens), the cauliflower
mosaic virus (CaMV) 19S and 35S promoters, the light-inducible
promoter from the small subunit of ribulose 1,5-bisphosphate carboxylase
(ssRUBISCO, a very abundant plant polypeptide), and the Figwort
Mosaic Virus (FMV) 35S promoter. All of these promoters have been
used to create various types of DNA constructs which have been
expressed in plants (see e.g., PCT publication WO 84/02913).

The particular promoter selected should be capable of causing
sufficient expression of the enzyme coding sequence to result in the
production of an effective amount of 3-hydroxysteroid oxidase. A preferred
promoter is a constitutive promoter such as FMV35S. It has been observed
to provide more uniform expression of heterologous genes in the flowering
portions of plants. Use of such a promoter with the 3-hydroxysteroid
oxidase gene may provide greater protection of cotton bolls and squares
from insect damage than other promoters.

The promoters used in the DNA constructs (i.e. chimeric plant
genes) of the present invention may be modified, if desired, to affect their
control characteristics. For example, the CaMV35S promoter may be
ligated to the portion of the ssRUBISCO gene that represses the
expression of ssRUBISCO in the absence of light, to create a promoter
which is active in leaves but not in roots. The resulting chimeric promoter
may be used as described herein. For purposes of this description, the
phrase "CaMV35S" promoter thus includes variations of CaMV35S
promoter, e.g., promoters derived by means of ligation with operator
regions, random or controlled mutagenesis, etc. Furthermore, the
promoters may be altered to contain multiple "enhancer sequences" to
assist in elevating gene expression. Examples of such enhancer sequences
have been reported by Kay et al. (1987).
The RNA produced by a DNA construct of the present invention
also contains a 5' non-translated leader sequence. This sequence can be
derived from the promoter selected to express the gene, and can be
specifically modified so as to increase translation of the mRNA. The 5'
non-translated regions can also be obtained from viral RNAs, from
suitable eukaryotic genes, or from a synthetic gene sequence. The
present invention is not limited to constructs wherein the non-translated
region is derived from the 5' non-translated sequence that accompanies
the promoter sequence. As shown below, a plant gene leader sequence
which is useful in the present invention is the petunia heat shock protein
70 (Hsp70) leader. [Winter et al.]

As noted above, the 3' non-translated region of the chimeric plant
genes of the present invention contains a polyadenylation signal which
functions in plants to cause the addition of adenylate nucleotides to the 3'
end of the RNA. Examples of preferred 3' regions are (1) the 3'
transcribed, non-translated regions containing the polyadenylate signal of
Agrobacterium tumor-inducing (Ti) plasmid genes, such as the nopaline
synthase (NOS) gene and (2) plant genes like the soybean 7s storage
protein genes and the pea ssRUBISCO E9 gene. [Fischhoff et al.]

Plant Transformation and Expression

A chimeric plant gene containing a structural coding sequence of
the present invention can be inserted into the genome of a plant by any
suitable method. Suitable plant transformation vectors include those
derived from a Ti plasmid of Agrobacterium tumefaciens, as well as those
disclosed, e.g., by Herrera-Estrella (1983), Bevan (1983), Klee (1985) and
EPO publication 0 120 516 (Schilperoort et al.). In addition to plant
transformation vectors derived from the Ti or root-inducing (Ri) plasmids
of Agrobacterium, alternative methods can be used to insert the DNA
constructs of this invention into plant cells. Such methods may involve,
for example, the use of liposomes, electroporation, chemicals that increase
free DNA uptake, free DNA delivery via microprojectile bombardment,
and transformation using viruses or pollen.

A particularly useful Ti plasmid cassette vector for transformation
of dicotyledonous plants is pMON11782. The expression cassette
pMON11782 consists of the FMV35S promoter, the petunia Hsp70 5'
untranslated leader, and the 3' end including polyadenylation signals from
the pea ssRUBISCO E9 gene. Between the leader and the 3'
polyadenylation signals is a multilinker containing multiple restriction
sites, including a BamHI site for the insertion of genes. pMON11782 also
contains a HindIII site before the promoter sequence.

The remainder of pMON11782 contains a segment of pBR322
(New England Biolabs, Beverly, MA) which provides an origin of
replication in E. coli; the oriV region from the broad host range plasmid
RK1 which allows replication in Agrobacterium strain ABI; the
streptomycin-spectinomycin resistance gene from Tn7; and a chimeric
NPTII gene, containing the CaMV35S promoter and the nopaline synthase
(NOS) 3' end, which provides kanamycin resistance in transformed plant
cells.

Another particularly useful Ti plasmid cassette vector is 
pMON17227. This vector is described by Barry et al. in WO 92/04449 
and contains a gene encoding an enzyme conferring glyphosate resistance
which is an excellent selection marker gene for many plants. 

Transient Expression of 3-Hydroxysteroid Oxidase in Tobacco Plants

Both 3-hydroxysteroid oxidase gene cassettes, that is the gene
encoding intact protein with the signal sequence and that encoding only
the mature protein, each modified at the N-terminus as described above,
were mobilized as NcoI-BamHI fragments and inserted into a transient
expression vector that had been cut with NcoI and BamHI. A transient
expression vector is a simple plasmid containing a plant promoter with a
5' nontranslated leader, a 3' nontranslated polyadenylation sequence, and
between them a multilinker having multiple restriction sites for insertion
of a protein coding sequence. The constructed vectors placed the
3-hydroxysteroid oxidase gene under the control of the FMV35S promoter
with the petunia HSP70 leader sequence discussed above. At the 3' end
terminator region is the non-translated polyadenylation signal terminator
region of the nopaline synthase gene. A plasmid containing the intact
protein coding sequence (SEQ ID NO:13) was identified and named
pMON20910. A plasmid containing the modified mature protein coding
sequence (SEQ ID NO:14) was identified and named pMON20908.

pMON20910 and pMON20908 are vectors for expression of
3-hydroxysteroid oxidase genes in plant cells, but these vectors lack
appropriate sequences for use in Agrobacterium-mediated plant
transformation. However, these vectors can be used either for transient
expression of 3-hydroxysteroid oxidase in plant cells or to generate stably
transformed cotton plants via free DNA delivery such as biolistic
bombardment of cotton meristems.

For transient expression analysis, plasmid DNA samples from
pMON20908 and pMON20910 vectors were purified and introduced into
tobacco via electroporation. Freeze-thaw extraction followed by a
nine-fold concentration of soluble fractions on Centricon-10 filter
concentrators allowed unambiguous detection of 3-hydroxysteroid oxidase
activity in all cell lysates, immunologically by Western blot assay and
enzymatically. The activity of the lysate from cells containing
pMON20908, that is the coding sequence for the modified mature protein,
was approximately ten-fold lower than that recovered from cells
containing pMON20910. Western blot analysis indicated that the signal
sequence is cleaved in protoplasts, although not necessarily with the
fidelity necessary to generate a processed protein identical in form and
activity to that naturally secreted by Streptomyces A19249.

Stable Transformation of Dicots with a 3-Hydroxysteroid Oxidase Gene

pMON20908 and pMON20910 were used to construct vectors for
stable transformation of dicots with Agrobacterium. Each was partially
digested with the restriction enzyme NotI to generate DNA fragments
containing the FMV35S promoter, the petunia HSP70 leader, either the
intact (full-length) or mature 3-hydroxysteroid oxidase coding sequence,
and the nos 3' polyadenylation site. Partial digestion was necessary
because in addition to NotI sites flanking the expression cassette, there
are two NotI sites within the cholesterol oxidase gene - one within the
coding region for the signal peptide and one within the coding region for
the mature protein. The NotI partial digest fragments containing the
entire expression cassettes are isolated using agarose gel electrophoresis
and purified by extraction from the agarose. These NotI fragments are
inserted into NotI-digested plasmid pMON17227, which is described by
Barry et al. in WO 92/04449. pMON20913 was identified as containing
the intact coding sequence. pMON20923 was identified as containing the
mature coding sequence.
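
The reason a partial digest is needed can be illustrated with a small Python sketch that enumerates the fragments obtainable from a circular plasmid. The plasmid length and NotI coordinates below are hypothetical values chosen only for illustration; the source does not give a restriction map. The first and last coordinates stand for the flanking sites and the two middle ones for the sites inside the coding sequence.

```python
from itertools import permutations

# Hypothetical map (assumption, not from the source): a 6000 bp circular
# plasmid with NotI cuts flanking the cassette (500, 3300) and two cuts
# inside the coding sequence (1400, 2100).
PLASMID_LENGTH = 6000
NOTI_SITES = [500, 1400, 2100, 3300]


def arc_length(start, end, total):
    """Length of the clockwise arc from start to end on a circle."""
    return (end - start) % total


def complete_digest(sites, total):
    """Fragment lengths when every site is cut."""
    ordered = sorted(sites)
    neighbours = ordered[1:] + ordered[:1]
    return [arc_length(a, b, total) for a, b in zip(ordered, neighbours)]


def partial_digest(sites, total):
    """Every fragment bounded by two cut sites, whatever happens between.

    Keys are (start_site, end_site) read clockwise; values are lengths.
    """
    return {
        (a, b): arc_length(a, b, total)
        for a, b in permutations(sorted(sites), 2)
    }


if __name__ == "__main__":
    print("complete:", complete_digest(NOTI_SITES, PLASMID_LENGTH))
    cassette = partial_digest(NOTI_SITES, PLASMID_LENGTH)[(500, 3300)]
    print("intact cassette fragment:", cassette)
```

Under these assumed coordinates the intact cassette is a 2800 bp fragment that exists only when both internal sites are left uncut, so it is absent from a complete digest and must be picked out of the partial digest products by size on a gel.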

These vectors were introduced into disarmed Agrobacterium host
ABI and used to transform tobacco in tissue culture. Selection for
glyphosate resistance led to several transformed lines for each vector. For
pMON20913, four out of 18 lines were confirmed as 3-hydroxysteroid
oxidase expressors by Western blot assay. For pMON20923, four out of
29 lines were confirmed as 3-hydroxysteroid oxidase expressors by
Western blot assay. Analysis of the lines transformed with pMON20913
confirmed the presence of 3-hydroxysteroid oxidase activity. The highest
expressing plant had an expression level of 0.2% of total protein, which is
equivalent to 54 ng of enzyme per mg of wet tissue.
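
The two figures quoted for the highest expressing plant together imply a total protein content for the tissue, and the line counts imply an expressor frequency for each vector. The short Python sketch below works through that arithmetic; the function names are illustrative, and the derived values are simple consequences of the numbers stated above rather than additional measurements.

```python
def total_protein_ng_per_mg(enzyme_ng_per_mg, percent_of_total_protein):
    """Total protein (ng per mg wet tissue) implied by the two figures."""
    return enzyme_ng_per_mg / (percent_of_total_protein / 100.0)


def expressor_fraction(confirmed, screened):
    """Fraction of transformed lines confirmed as expressors."""
    return confirmed / screened


if __name__ == "__main__":
    # 54 ng enzyme per mg tissue at 0.2% of total protein
    total = total_protein_ng_per_mg(54, 0.2)
    print(f"implied total protein: {total / 1000:.0f} ug per mg wet tissue")
    print(f"pMON20913 expressors: {expressor_fraction(4, 18):.1%}")
    print(f"pMON20923 expressors: {expressor_fraction(4, 29):.1%}")
```

In other words, the stated values correspond to roughly 27 µg of total protein per mg of wet tissue, with about 22% and 14% of the selected lines expressing detectable enzyme for pMON20913 and pMON20923 respectively.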

Vectors containing the intact or mature 3-hydroxysteroid oxidase
cassette express the active enzyme in the cytoplasm of the plant cell.
There has been no evidence of secretion outside the transformed cells.
Some bacterial secretory signal sequences have been shown to function in
plant cells. It may be desirable to direct most or all of the
3-hydroxysteroid oxidase protein into the plant secretory pathway. To
achieve this, it may be advantageous to use a signal sequence derived
from a plant gene rather than a bacterial signal. An example of such a
signal is that from the tobacco PR1b gene, described by Cornelissen et al.
pMON10824, disclosed in EP Publ. 0 385 962, is a plant transformation
vector designed for the expression of the lepidopteran active B.t. kurstaki
protein. In pMON10824, the B.t.k. coding sequence is fused to the PR1b
signal sequence plus 10 amino acids of the mature PR1b coding sequence.
This B.t.k. fusion gene is driven by the CaMV35S promoter containing a
duplicated enhancer. Other vectors carrying the PR1b signal and
CaMV35S promoter may also serve as the source of these elements. To
create a vector in which the PR1b signal is fused to the 3-hydroxysteroid
oxidase gene, a DNA fragment containing the CaMV35S promoter
sequence and the PR1b signal sequence is used to replace the fragment
containing the FMV promoter and HSP70 leader sequences in
pMON20913 and pMON20923. This results in plasmids in which the
3-hydroxysteroid oxidase coding sequence (either mature protein or intact
protein cassette) is fused to the amino terminal secretory signal from the
PR1b gene and is driven by the CaMV35S promoter. NcoI restriction
enzyme sites at the 3' end of the PR1b signal-containing fragment and the
5' end of the modified 3-hydroxysteroid oxidase protein coding sequence
allow in-frame translational fusions between the two elements.
pMON20930 and pMON20932 are plant transformation vectors that carry
such fusions of the PR1b signal sequence to the intact cholesterol oxidase
and to the mature cholesterol oxidase, respectively. A similar plasmid may
be constructed wherein the 3-hydroxysteroid oxidase gene is under the
control of the FMV35S promoter. Such plasmids are mobilized into a
disarmed Agrobacterium host and used to transform dicots. Thus, these
plants produce a 3-hydroxysteroid oxidase that is secreted into the
extracellular space.

In some cases, proteins that enter the plant secretory pathway are
targeted to different cellular compartments such as the vacuole. It may
be desirable to direct the 3-hydroxysteroid oxidase to the vacuole of plant
cells. In this case, the vectors described above in which the PR1b signal is
used are further modified to include vacuolar targeting sequences derived
from known plant vacuolar enzyme genes.

It may also be desirable to retain the 3-hydroxysteroid oxidase in
the lumen of the endoplasmic reticulum (ER). In this case, vectors in
which the PR1b signal is used to target the protein to the secretory
pathway are further modified to include sequences known to encode ER
retention signals. These sequences are added such that the four-amino
acid-long ER retention peptides, such as HDEL (SEQ ID NO:15) and
KDEL (SEQ ID NO:16), are fused to the carboxy terminus of the
3-hydroxysteroid oxidase. Site-directed oligonucleotide mutagenesis is
used to introduce these additions to the carboxy-terminus coding sequence
of 3-hydroxysteroid oxidase. pMON20937 and pMON20938 are vectors in
which the peptides HDEL (SEQ ID NO:15) and RGSEKDEL (SEQ ID
NO:17) are introduced, respectively, into pMON20932.
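
The net effect of such a carboxy-terminal addition on the coding sequence can be sketched in Python as follows. This only models the resulting sequence, not the mutagenesis procedure itself. The codon chosen for each residue is an arbitrary assumption (the source does not state which codons were used), and the 27-nucleotide 3' end is read from the last line of SEQ ID NO:13 with its one ambiguous character taken as G.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# One codon per residue, picked arbitrarily for illustration (assumption).
CODON_FOR = {
    "H": "CAC", "D": "GAC", "E": "GAG", "L": "CTG",
    "K": "AAG", "R": "CGC", "G": "GGC", "S": "TCC",
}

# 3' end of the modified coding sequence (end of SEQ ID NO:13):
# Ile Lys Glu Asp Ile Ala Gly Ser, followed by the TGA stop codon.
CDS_3PRIME_END = "ATCAAGGAGGACATCGCGGGTTCCTGA"


def add_c_terminal_peptide(cds, peptide, stop="TGA"):
    """Insert codons for `peptide` between the last codon and the stop."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # drop the existing stop codon
    tail = "".join(CODON_FOR[aa] for aa in peptide)
    return cds + tail + stop


if __name__ == "__main__":
    print(add_c_terminal_peptide(CDS_3PRIME_END, "HDEL"))
    print(add_c_terminal_peptide(CDS_3PRIME_END, "RGSEKDEL"))
```

The retention peptide must sit at the very end of the protein, so the original stop codon is removed and a new one is placed after the added codons.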
It may also be advantageous to direct the localization of the
3-hydroxysteroid oxidase protein to another cellular compartment, the
chloroplast. Proteins can be directed to the chloroplast by including at
their N-termini a chloroplast transit peptide (CTP). One CTP that has
worked to localize heterologous proteins to the chloroplast is that derived
from the RUBISCO small subunit gene of Arabidopsis, denoted ats1A. A
variant of this transit peptide that encodes the transit peptide, 24 amino
acids of mature RUBISCO sequence, plus a reiteration of the transit
peptide cleavage site has been constructed for the successful chloroplast
localization of the B.t.k. protein (Wong, 1992). Vectors containing the
Arabidopsis ats1A transit peptide fused to the B.t.k. gene may be used as
the base for constructing vectors for the chloroplast localization of the
3-hydroxysteroid oxidase protein. For example, pMON10817, constructed
of the Arabidopsis RUBISCO small subunit promoter from the ats1A
gene, the native ats1A 5' untranslated leader, the ats1A chloroplast
transit peptide variant described above and a truncated B.t.k. enzyme
gene, is digested with restriction enzymes NotI and NcoI. A NotI - NcoI
DNA fragment containing the promoter, leader and transit peptide is used
to replace a NotI - NcoI fragment carrying the FMV promoter and HSP70
leader sequences in pMON20913 and pMON20923. These reactions
construct plasmids pMON20929 and pMON20931 in which the
3-hydroxysteroid oxidase coding sequence (either intact protein or mature
protein cassette, respectively) is fused at its amino terminus to the
chloroplast transit peptide and is transcribed from the ats1A promoter.
Alternatively, a similar plasmid may be constructed to replace the ats1A
promoter with the CaMV35S or FMV35S promoters. Such plasmids are
mobilized into disarmed Agrobacterium hosts and used to transform
dicots. Thus, these plants produce a 3-hydroxysteroid oxidase that is
localized to the chloroplast.

It may also be advantageous to direct the 3-hydroxysteroid oxidase
to another subcellular compartment, the mitochondria. Examples of
proteins that are normally localized to the mitochondria and for which the
targeting sequences are known include cytochrome c1, which is localized
to the mitochondrial intermembrane space (Braun, 1992), and the β
subunit of ATP synthase, which is localized to the mitochondrial matrix
(Boutry, 1987). Mitochondrial targeting DNA sequences encoding the
amino terminal mitochondrial targeting peptides are isolated using the
polymerase chain reaction (PCR). Oligonucleotide primers corresponding
to the amino and carboxy terminal portions of these targeting sequences
are used to PCR-amplify first strand cDNA that has been generated from
mRNA using reverse transcriptase. The oligonucleotide primers have
NcoI restriction enzyme sites attached so that the amplified product,
when cleaved with NcoI, can be inserted into the NcoI sites at the amino
terminus of the cloned 3-hydroxysteroid oxidase gene cassettes. Thus,
various forms of 3-hydroxysteroid oxidase fused to mitochondrial targeting
peptides can be expressed in plants using promoters such as CaMV35S or
FMV35S, and can be localized to the mitochondria.
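
The idea of attaching NcoI sites to the amplification primers can be sketched in Python as below. The targeting-sequence template is a made-up placeholder (the source gives no such sequence), the 18-nucleotide annealing length is an arbitrary assumption, and the sketch does not address keeping the fusion in frame, which a real design would have to check.

```python
NCOI = "CCATGG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical targeting-peptide coding sequence, for illustration only.
TEMPLATE = "ATGGCTTCTCGTCGTCTTCTTGCTTCTCTTCTTCGTCAATCTGCTCAACGTGGTGGTGGT"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(template, anneal_len=18, tail=NCOI):
    """Forward and reverse primers, each with the site added at its 5' end."""
    forward = tail + template[:anneal_len]
    reverse = tail + reverse_complement(template[-anneal_len:])
    return forward, reverse


def amplified_product(template, tail=NCOI):
    """Top strand of the expected PCR product, with a site at both ends."""
    return tail + template + reverse_complement(tail)


if __name__ == "__main__":
    fwd, rev = tailed_primers(TEMPLATE)
    print("forward:", fwd)
    print("reverse:", rev)
    print("product:", amplified_product(TEMPLATE))
```

Because CCATGG is its own reverse complement, the same tail works on both primers, and the product carries an NcoI site at each end ready for cleavage and insertion.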

Stable Transformation of Monocots 
A 3-hydroxysteroid oxidase gene may be stably incorporated into
the genome of monocots using the vectors and methods described in WO
93/19189 (Brown et al.). The gene can be inserted in an appropriate
vector, for example pMON19653 and pMON19643, described by Brown et
al. The resulting construct contains a cassette of the CaMV E35S
promoter, the Hsp70 intron, the CP4 glyphosate selection marker, and
the NOS terminator; a cassette of the CaMV E35S promoter, the Hsp70
intron, the GOX glyphosate selection marker, and the NOS terminator;
and a single NotI site for insertion of a gene expression cassette
containing a 3-hydroxysteroid oxidase gene, such as SEQ ID NO:13 or
SEQ ID NO:14.

This vector is inserted by bombardment of embryogenic tissue 
culture cells using a biolistic particle gun as described by Brown et al. 
Transformed cells are selected for glyphosate resistance and whole plants 
are regenerated. Insect-resistant plants may be confirmed to be 
expressing the gene by Western blot analysis, esterase activity assay, or
insect resistance assay. 

Targeting of the protein to certain cellular compartments is also 
possible in monocots using the signal sequences described above. 

From the foregoing, it will be seen that this invention is one well
adapted to attain all the ends and objects hereinabove set forth together
with advantages which are obvious and which are inherent to the
invention. It will be understood that certain features and subcombinations
are of utility and may be employed without reference to other features and
subcombinations. This is contemplated by and is within the scope of the
claims. Since many possible embodiments may be made of the invention
without departing from the scope thereof, it is to be understood that all
matter herein set forth or shown in the accompanying drawings is to be
interpreted as illustrative and not in a limiting sense.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: Monsanto Company
    (B) STREET: 800 North Lindbergh Boulevard
    (C) CITY: St. Louis
    (D) STATE: Missouri
    (E) COUNTRY: United States of America
    (F) POSTAL CODE (ZIP): 63167
    (G) TELEPHONE: (314)694-3131
    (H) TELEFAX: (314)694-5435

(ii) TITLE OF INVENTION: Method of Controlling Insects

(iii) NUMBER OF SEQUENCES: 14

(iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(vi) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 07/762682
    (B) FILING DATE: 23-SEP-1991

(vi) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 07/937195
    (B) FILING DATE: 09-SEP-1992

(vi) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 08/083948
    (B) FILING DATE: 28-JUN-1993



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Val Ser Thr Leu Met Leu Glu Met Gly Gln Leu Trp Asn Gln Pro
1               5                   10                  15



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Ala Phe Ala Asp Asp Phe Cys Tyr His Pro Leu Gly Gly Cys Val Leu
1               5                   10                  15



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Asn Leu Tyr Val Thr Asp Gly Ser Leu Ile Pro Gly
1               5                   10



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 45 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GTGTCCACCC TGATGCTGGA GATGGGCCAG CTGTGGAACC AGCCC 45
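
As a cross-check between entries, the 45-base sequence of SEQ ID NO:4 (read here as GTGTCCACCC TGATGCTGGA GATGGGCCAG CTGTGGAACC AGCCC) can be translated with the standard genetic code and compared with the 15-residue peptide of SEQ ID NO:1. The Python sketch below is illustrative only; the helper names are not part of the listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

SEQ_ID_4 = "GTGTCCACCC TGATGCTGGA GATGGGCCAG CTGTGGAACC AGCCC"
SEQ_ID_1 = "Val Ser Thr Leu Met Leu Glu Met Gly Gln Leu Trp Asn Gln Pro"


def translate(dna):
    """Translate DNA (spaces ignored) into a list of three-letter codes."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]]
            for i in range(0, usable, 3)]


if __name__ == "__main__":
    print(" ".join(translate(SEQ_ID_4)))
    print(translate(SEQ_ID_4) == SEQ_ID_1.split())
```

The same helper can be applied to SEQ ID NO:5 and SEQ ID NO:6 to compare them with SEQ ID NO:2 and SEQ ID NO:3.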

35 

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 48 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GCCTTCGCCG ACGACTTCTG CTACCACCCG CTCGGCGGCT GCGTCCTG 48


(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

AACCTCTACG TGACCGACGG TTCGCTGATC CCGGGT 36


(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1865 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CTACTCCATG GCGTGCTGAA GGTCGGTGCC TGGCCTCCCG AGGTCGTCGA GGACTTCGTG 60 
20 AAGTGAGCGG GCACCCCGCC CGTCCCCGCC CCGCAACGGC CCGTTCCGCJV CACCGGGTGA 120 
CCCGACCCCC TCGGCCCCCG ACCTCCGCCG ACCTCTCAGT CCCCTCTCGA AGCTCAGGAG 180 
CAACAGCGTG AACGCACACC AGtXTCTGTC GCCCCGCCGC ATGCTCGGCC TGGCCGCCTT 240 

25 

GGGCGCCGCC GCACTCACCG GGCAGACCAC GATCACCGC6 GCCCCCCGCG CmXOCCGC 300 
CACCGCCCCC GGC GG CT C CG GC6GCACGTT CGTGCCCGCC GTCGTQATCG GCA0C6GCTA 360 
30 C66CGCGGCC GTCTCCGCCC TGCGGCTCGG CGAGGCCGGG G TCTC C ACCC TGATGCTGCSA .420 
GAT6GGCCAG CTGTGGAACC AGCCCGGCCC GGACGGCAAC GTCTTCTGCG OGATGCTCAA 480 
GCCCGACAAG CGCTCCAGCT CGTTCAAGAC CCGCACCGA6 GCCCCGCTCC GCTCCTTC C T 540 

35 

CTGGCTCGAC CTCGCCAACC GG6ACATCGA CCCCTACGC6 GGCGTCCTGG ACCGGGTCAA 600 
CTTCGACCAG ATGTCCGTGT ACGTGGGCCG CGGGGTCGGC GGCGGCTCGC TCGTCAACGG 660 
40 CGGTATGGCC GTCACGCCCC GGCGCTCCTA CTTCCAGGAG GTGCTGCCCC AGGTCGACGC 720 
CGACGAGATG TACGGCACCT ACTTCCC6C6 CGCGAACTCC GGCCTGCG6G TCAACAACAT 780 
CGACAAGGAC TGGTICGAGC AGACCGAGTG GTACACGTTC GCGCGCGTT6 CCCGTCTOCA 840 

45 

GGCCGAGAAC GCOGGCCTGA AGACCACCTT CGTGCCCAAC GTCTACGACT GGGACTACAT 900 
GCGCGGT6A6 GCGGACGGCA CCAACCCCAA 6TCC0CGCTC GCCGCCGAGG TCATCTACGG 960 
50 CAACAACCAC GGCAACGTCT CCCTCGACAA GAGCTACCTG GCGGCCGCCC TGGGCACCGG 1020 
CAAGGTCACC GTCGAGACCC TGCACCAGGT CAAGACGATC CGTCAGCAGA ACGACGGCAC 1080 
CTACCTGCTG ACGGTCGAGC AGAAGGACCC CGACGGCAAG CTGCTCGGGA CCAAGQAGAT 1140 

55 

CTCCTGCC<3C CACCTCTTCC TCGGCGCCGG CAGCCTCGGC TCCATTGAAC TGCTGCTGCG 1200 
CGCCCGG6AG ACCGGCACCC TGCCCGGCCT CAGCTCCGAG ATCGGCGGCG OCTGGGGCCC 1260 
60 CAACGGCAAC ATCATGACCG CCCGCGCCAA CCATGTGTGG AACCCCACGG GCAGCAAGCA 1320 
GTCGTCGATC CCCGCCCTCG GCATCGACGA CT6GGACAAC CCCGACAACC CCGTCtT CGC 1380 
CGAGATAGCC CCCATGCCGG CGGGCCTCGA GACCTGGGTC AGCCTCTACC TGGCCATCAC 1440
CAAGAACCCG GA6CGCGGCA CCTTCGTCTA CGACGCCGCC AAGGACCGGG CGGACCTGCG 1500 

5 

CTGGACCCGG GACCAGAACG CGCCCGCGGT CGCCGCCGCC AAGTCGCTGT TCGACCGCGT 1560 
CAACAAGGCC AACACGACCA TCTACCGGTA CGACCTCTTC GGCAAGCAGA TCAAGGCGTT 1620 
10 CGCCGACGAC TTCTGCTACC ACCCGCTCGG CGGCTGCGTC CTCGGCAAGG CCACCGACAA 1680 
CTACGGCCGC GTCTCCGGGT ACAAGAACCT CTACGTCACC GACGGCTCOC TCATCCCCGG 1740 
CAGCATCGGC GTCAACCCGT TCGTGACCAT CACGGCGCTG GCGGAGCGGA ACGTCGAGCG 1800 

15 

CGTCATCAAG GAGGACATCG CGGGTTCCTG ACGAOCGACG 6GCGGGGC6C GGCATGCAAG 1860 
CTT66 1865 

20 

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 547 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Val Asn Ala His Gln Pro Leu Ser Arg Arg Arg Met Leu Gly Leu Ala
1               5                   10                  15

Ala Leu Gly Ala Ala Ala Leu Thr Gly Gln Thr Thr Ile Thr Ala Ala
            20                  25                  30

Pro Arg Ala Ala Ala Ala Thr Ala Pro Gly Gly Ser Gly Gly Thr Phe
        35                  40                  45

Val Pro Ala Val Val Ile Gly Thr Gly Tyr Gly Ala Ala Val Ser Ala
    50                  55                  60

Leu Arg Leu Gly Glu Ala Gly Val Ser Thr Leu Met Leu Glu Met Gly
65                  70                  75                  80

Gln Leu Trp Asn Gln Pro Gly Pro Asp Gly Asn Val Phe Cys Gly Met
                85                  90                  95

Leu Lys Pro Asp Lys Arg Ser Ser Trp Phe Lys Thr Arg Thr Glu Ala
            100                 105                 110

Pro Leu Gly Ser Phe Leu Trp Leu Asp Leu Ala Asn Arg Asp Ile Asp
        115                 120                 125

Pro Tyr Ala Gly Val Leu Asp Arg Val Asn Phe Asp Gln Met Ser Val
    130                 135                 140

Tyr Val Gly Arg Gly Val Gly Gly Gly Ser Leu Val Asn Gly Gly Met
145                 150                 155                 160

Ala Val Thr Pro Arg Arg Ser Tyr Phe Gln Glu Val Leu Pro Gln Val
                165                 170                 175

Asp Ala Asp Glu Met Tyr Gly Thr Tyr Phe Pro Arg Ala Asn Ser Gly
            180                 185                 190

Leu Arg Val Asn Asn Ile Asp Lys Asp Trp Phe Glu Gln Thr Glu Trp
        195                 200                 205

Tyr Thr Phe Ala Arg Val Ala Arg Leu Gln Ala Glu Asn Ala Gly Leu
    210                 215                 220

Lys Thr Thr Phe Val Pro Asn Val Tyr Asp Trp Asp Tyr Met Arg Gly
225                 230                 235                 240

Glu Ala Asp Gly Thr Asn Pro Lys Ser Ala Leu Ala Ala Glu Val Ile
                245                 250                 255

Tyr Gly Asn Asn His Gly Lys Val Ser Leu Asp Lys Ser Tyr Leu Ala
            260                 265                 270

Ala Ala Leu Gly Thr Gly Lys Val Thr Val Glu Thr Leu His Gln Val
        275                 280                 285

Lys Thr Ile Arg Gln Gln Asn Asp Gly Thr Tyr Leu Leu Thr Val Glu
    290                 295                 300

Gln Lys Asp Pro Asp Gly Lys Leu Leu Gly Thr Lys Glu Ile Ser Cys
305                 310                 315                 320

Arg His Leu Phe Leu Gly Ala Gly Ser Leu Gly Ser Ile Glu Leu Leu
                325                 330                 335

Leu Arg Ala Arg Glu Thr Gly Thr Leu Pro Gly Leu Ser Ser Glu Ile
            340                 345                 350

Gly Gly Gly Trp Gly Pro Asn Gly Asn Ile Met Thr Ala Arg Ala Asn
        355                 360                 365

His Val Trp Asn Pro Thr Gly Ser Lys Gln Ser Ser Ile Pro Ala Leu
    370                 375                 380

Gly Ile Asp Asp Trp Asp Asn Pro Asp Asn Pro Val Phe Ala Glu Ile
385                 390                 395                 400

Ala Pro Met Pro Ala Gly Leu Glu Thr Trp Val Ser Leu Tyr Leu Ala
                405                 410                 415

Ile Thr Lys Asn Pro Glu Arg Gly Thr Phe Val Tyr Asp Ala Ala Lys
            420                 425                 430

Asp Arg Ala Asp Leu Arg Trp Thr Arg Asp Gln Asn Ala Pro Ala Val
        435                 440                 445

Ala Ala Ala Lys Ser Leu Phe Asp Arg Val Asn Lys Ala Asn Thr Thr
    450                 455                 460

Ile Tyr Arg Tyr Asp Leu Phe Gly Lys Gln Ile Lys Ala Phe Ala Asp
465                 470                 475                 480

Asp Phe Cys Tyr His Pro Leu Gly Gly Cys Val Leu Gly Lys Ala Thr
                485                 490                 495

Asp Asn Tyr Gly Arg Val Ser Gly Tyr Lys Asn Leu Tyr Val Thr Asp
            500                 505                 510

Gly Ser Leu Ile Pro Gly Ser Ile Gly Val Asn Pro Phe Val Thr Ile
        515                 520                 525

Thr Ala Leu Ala Glu Arg Asn Val Glu Arg Val Ile Lys Glu Asp Ile
    530                 535                 540

Ala Gly Ser
545



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 27 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CTCAGGAGCA CCATGGCGAC CGCACAC 27



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 29 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GTGCCGCCGG AGGCCATGGG GGCGGTGGC 29



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 30 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GCCCCGCCCG TCGGATCCGT CAGGAACCCG 30



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Ser Gly Gly Thr Phe
1               5


(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1647 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

ATGGCGACCG CACACCAGCC TCTGTCGCGC CGCCGCATGC TCGGCCTGGC CGCCTTGGGC 60 
GCCGCCGCAC TCACCGGGCA GACCACGATC ACCGCGGCCC CCCGCGCGGC CGCCGCCACC 120 
GCCCCCGGCG GCTCCGGCGG CACGTTCGTG CCCGCCGTCG TGATCGGCAC CGGCTACGGC 180 
GCGGCCGTCT CCGCCCTGCG GCTCGCCGAC GCCGGGGTCT CCACCCTGAT GCTGGAGATG 240 
GGCCAGCTGT GGAACCAGCC CGGCCCGGAC GGCAACGTCT TCTGCGGGAT GCTCAAGCCC 300 
GACAAGCGCT CCAGCTGGTT CAAGACCCGC ACCGAGGCCC CGCTCGGCTC CTTCCTCTGG 360 
CTCGACCTCG CCAACCGGGA CATCGACCCC TACGCGGGCG TCCTGGACCG GGTCAACTTC 420 
GACCAGATGT CCGTGTACGT GGGCCGCGGG GTCGGCGGCG GCTCGCTCGT CAACGGCGGT 480 
ATGGCCGTCA CGCCCCGGCG CTCCTACTTC CAGGAGGTGC TGCCCCAGGT CGACGCCGAC 540 
GAGATGTACG GCACCTACTT CCCGCGCGCG AACTCCGGCC TGCGGGTCAA CAACATCGAC 600 
AACGACTGGT TCGAGCAGAC CGAGTGGTAC ACGTTCGCGC GCGTTGCCCG TCTGCACGCC 660 
GAGAACGCCG GCCTGAAGAC CACCTTCGTG CCCAACGTCT ACGACTGGGA CTACATGCGC 720 
GGTGAGGCGG ACGGCACCAA CCCCAAGTCC GCGCTCGCCG CCGAGGTCAT CTACGGCAAC 780 
AACCACGGCA AGGTCTCCCT CGACAAGAGC TACCTGGCGG CCGCCCTGGG CACCGGCAAG 840 
GTCACCGTCG AGACCCTGCA CCAGGTCAAG ACGATCCGTC AGCAGAACGA CGGCACCTAC 900 
CTGCTGACGG TCGAGCAGAA GGACCCCGAC GGCAAGCTGC TCGGGACCAA GGAGATCTCC 960 
TGCCGCCACC TCTTCCTCGG CGCCGGCAGC CTCGGCTCCA TTGAACTGCT GCTGCGCGCC 1020 
CGGGAGACCG GCACCCTGCC CGGCCTCAGC TCCGAGATCG GCGGCGGCTG GGGCCCCAAC 1080 
GGCAACATCA TGACCGCCCG CGCCAACCAT GTGTGGAACC CCACGGGCAG CAACCAGTCG 1140 
TCGATCCCCG CCCTCGGCAT CGACGACTGG GACAACCCCG ACAACCCCGT CTTCGCCGAG 1200 
ATAGCCCCCA TGCCGGCGGG CCTCGAGACC TGGGTCAGCC TCTACCTGGC CATCACCAAG 1260 
AACCCGGAGC GCGGCACCTT CGTCTACGAC GCCGCCAAGG ACCGGGCGGA CCTGCGCTGG 1320 
ACCCGGGACC AGAACGCGCC CGCGGTCGCC GCCGCCAAGT CGCTGTTCGA CCGGGTCAAC 1380 
AAGGCCAACA CGACCATCTA CCGGTACGAC CTCTTCGGCA AGCAGATCAA GGCGTTCGCC 1440 
GACGACTTCT GCTACCACCC GCTCGGCGGC TGCGTCCTCG GCAAGGCCAC CGACAACTAC 1500 
GCCCGCGTCT CCGGGTACAA GAACCTCTAC GTCACCGACG GCTCGCTCAT CCCCGGCAGC 1560 
ATCGGCGTCA ACCCGTTCGT GACCATCACG GCGCTGGCGG AGCGGAACGT CGAGCGCGTC 1620 
ATCAAGGAGG ACATCGCGGG TTCCTGA 1647 

(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1521 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

ATGGCCTCCG GCGGCACCTT CGTGCCCGCC GTCGTGATCG GCACCGGCTA CGGCCCGGCC 60 
GTCTCCGCCC TGCGGCTCGG CGAGGCCGGG GTCTCCACCC TGATGCTGGA GATGGGCCAG 120 
CTGTGGAACC AGCCCGGCCC GGACGGCAAC GTCTTCTGCG GGATGCTCAA GCCCGACAAG 180 
CGCTCCAGCT GGTTCAAGAC CCGCACCGAG GCCCCGCTCG GCTCCTTCCT CTGGCTCGAC 240 
CTCGCCAACC GGGACATCGA CCCCTACGCG GGCGTCCTGG ACCGGGTCAA CTTCGACCAG 300 
ATGTCCGTGT ACGTGGGCCG CGGGGTCGGC GGCGGCTCGC TCGTCAACGG CGGTATGGCC 360 
GTCACGCCCC GGCGCTCCTA CTTCCAGGAG GTGCTGCCCC AGGTCGACGC CGACGAGATG 420 
TACGGCACCT ACTTCCCGCG CGCGAACTCC GGCCTGCGGG TCAACAACAT CGACAAGGAC 480 
TGGTTCGAGC AGACCGAGTG GTACACGTTC GCGCGCGTTG CCCGTCTGCA GGCCGAGAAC 540 
GCCGGCCTGA AGACCACCTT CGTGCCCAAC GTCTACGACT GGGACTACAT GCGCGGTGAG 600 
GCGGACGGCA CCAACCCCAA GTCCGCGCTC GCCGCCGAGG TCATCTACGG CAACAACCAC 660 
GGCAAGGTCT CCCTCGACAA GAGCTACCTG GCGGCCGCCC TGGGCACCGG CAAGGTCACC 720 
GTCGAGACCC TGCACCAGGT CAAGACGATC CGTCAGCAGA ACGACGGCAC CTACCTGCTG 780 
ACGGTCGAGC AGAAGGACCC CGACGGCAAG CTGCTCGGGA CCAAGGAGAT CTCCTGCCGC 840 
CACCTCTTCC TCGGCGCCGG CAGCCTCGGC TCCATTGAAC TGCTGCTGCG CGCCCGGGAG 900 
ACCGGCACCC TGCCCGGCCT CAGCTCCGAC ATCGGCGGCG GCTGGGGCCC CAACGGCAAC 960 
ATCATGACCG CCCGCGCCAA CCATGTGTGG AACCCCACGG GCAGCAAGCA GTCGTCGATC 1020 
CCCGCCCTCC GCATCGACGA CTGGGACAAC CCCGACAACC CCGTCTTCGC CGAGATAGCC 1080 
CCCATGCCGG CGGGCCTCGA GACCTGGGTC AGCCTCTACC TGGCCATCAC CAAGAACCCG 1140 
GAGCGCGGCA CCTTCGTCTA CGACGCCGCC AAGGACCGGG CGGACCTGCG CTGGACCCGG 1200 
GACCAGAACG CGCCCGCGGT CGCCGCCGCC AAGTCGCTGT TCGACCGCGT CAACAAGGCC 1260 
AACACGACCA TCTACCGGTA CGACCTCTTC GGCAAGCAGA TCAAGGCGTT CGCCGACGAC 1320 
TTCTGCTACC ACCCGCTCGG CGGCTGCGTC CTCGGCAAGG CCACCGACAA CTACGGCCCC 1380 
GTCTCCGGGT ACAAGAACCT CTACGTCACC GACGGCTCGC TCATCCCCGG CAGCATCGGC 1440 
GTCAACCCGT TCGTGACCAT CACGGCGCTG GCGGAGCGGA ACGTCGAGCG CGTCATCAAG 1500 
GAGGACATCG CGGGTTCCTG A 1521 
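
The stated lengths of SEQ ID NO:13 (1647 bases) and SEQ ID NO:14 (1521 bases) are both multiples of three, which is consistent with complete open reading frames. The Python sketch below checks that arithmetic and translates the first 21 bases of SEQ ID NO:14 with the standard genetic code; the result contains the pentapeptide of SEQ ID NO:12 (Ser Gly Gly Thr Phe). The helper name `translate`, the constant names, and the compact codon-table construction are illustrative assumptions, and only the opening bases of the sequence are used because the remainder of the scanned listing may contain recognition errors.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Lengths as stated in the SEQUENCE CHARACTERISTICS fields.
SEQ13_LENGTH = 1647
SEQ14_LENGTH = 1521

# First 21 bases of SEQ ID NO:14 as printed in the listing.
SEQ14_START = "ATGGCCTCCGGCGGCACCTTC"


def translate(dna):
    """Translate a DNA string codon by codon; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    for label, length in (("SEQ ID NO:13", SEQ13_LENGTH), ("SEQ ID NO:14", SEQ14_LENGTH)):
        # A complete reading frame has a length divisible by three;
        # the final codon is the stop codon.
        print(label, "codons:", length // 3, "remainder:", length % 3)
    print(translate(SEQ14_START))
```

The 126-base difference between the two stated lengths corresponds to 42 codons, which matches the offset at which the Ser Gly Gly Thr Phe coding region appears in SEQ ID NO:13.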



(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

His Asp Glu Leu 
1 

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

Lys Asp Glu Leu 
1 

(2) INFORMATION FOR SEQ ID NO:17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

Arg Gly Ser Glu Lys Asp Glu Leu 
1 5 
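
The short peptides above (SEQ ID NOS:12 and 15-17) are written in three-letter amino acid code. The following Python sketch converts them to the one-letter code, which makes the shared C-terminal Asp Glu Leu ending of SEQ ID NOS:15-17 easy to see. The function name `to_one_letter` and the dictionary name are hypothetical. The remark that HDEL and KDEL are commonly described as endoplasmic-reticulum retention motifs is background knowledge, not a statement made in this part of the listing.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Peptides as printed in the listing.
PEPTIDES = {
    "SEQ ID NO:12": "Ser Gly Gly Thr Phe",
    "SEQ ID NO:15": "His Asp Glu Leu",
    "SEQ ID NO:16": "Lys Asp Glu Leu",
    "SEQ ID NO:17": "Arg Gly Ser Glu Lys Asp Glu Leu",
}


def to_one_letter(peptide):
    """Convert a space-separated three-letter peptide to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split())


if __name__ == "__main__":
    for name, peptide in PEPTIDES.items():
        print(name, to_one_letter(peptide))
```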
WHAT IS CLAIMED IS: 

1. A method of controlling lepidopteran insect infestation of plants 
comprising providing a 3-hydroxysteroid oxidase for ingestion by the 
insect. 

2. The method of Claim 1 wherein the insect is in a larval stage. 

3. The method of Claim 1 wherein said 3-hydroxysteroid oxidase is 
provided by plant-colonizing microorganisms which produce 
3-hydroxysteroid oxidase after application to the plant. 

4. The method of Claim 1 wherein said 3-hydroxysteroid oxidase is 
provided by expression of a gene for 3-hydroxysteroid oxidase incorporated 
in the plant by previous genetic transformation of a parent cell of the 
plant. 

5. The method of Claim 4 wherein said plant is cotton or corn. 

6. A method of producing a genetically transformed plant which 
expresses an amount of a 3-hydroxysteroid oxidase effective to control 
lepidopteran insect infestation, comprising the steps of: 

a) inserting into the genome of a plant cell a recombinant, 
double-stranded DNA molecule comprising 

(i) a promoter which functions in plant cells to cause the 
production of an RNA sequence; 

(ii) a structural coding sequence that encodes for 
3-hydroxysteroid oxidase; 

(iii) a 3' non-translated region which functions in said plant 
cells to cause the addition of polyadenylate nucleotides to the 
3' end of the RNA sequence, 

wherein said promoter is heterologous with respect to said 
structural coding sequence and wherein said promoter is 
operatively linked with said structural coding sequence, which is in 
turn operably linked with said non-translated region; 

b) obtaining transformed plant cells; and 

c) regenerating from the transformed plant cells genetically 
transformed plants which express an amount of 3-hydroxysteroid 
oxidase effective to control lepidopteran insect infestation. 

7. The method of Claim 6 wherein said structural DNA sequence 
comprises SEQ ID NO:13 or SEQ ID NO:14. 

8. The method of Claim 6 wherein said plant cell is a cotton or corn 
plant cell. 

9. The method of Claim 6 wherein the genome of said plant 
cell also contains one or more genes expressing B.t. endotoxins. 

AF263912 standard; DNA; PRO; 123580 BP. 

AF263912; 

AF263912.1 

25-MAY-2000 (Rel. 63, Created) 
25-MAY-2000 (Rel. 63, Last updated, Version 1) 

Streptomyces noursei ATCC 11455 nystatin biosynthetic gene cluster, 
complete sequence. 

Streptomyces noursei 
Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales; 
Streptomycineae; Streptomycetaceae; Streptomyces. 

[1] 
1-123580 
Brautaset T., Sekurova O.N., Sletta H., Ellingsen T.E., Strom A.R., 
Valla S., Zotchev S.B.; 
"Biosynthesis of the polyene antifungal antibiotic nystatin in Streptomyces 
noursei ATCC 11455: analysis of the gene cluster and deduction of the 
biosynthetic pathway"; 
Chem. Biol. 7(6):395-403(2000). 

[2] 
1-123580 
Brautaset T., Sekurova O.N., Sletta H., Ellingsen T.E., Strom A.R., 
Valla S., Zotchev S.B.; 
Submitted (04-MAY-2000) to the RL Unigen, NTNU, O. Kyrres gt. 3, Trondheim N- 
SPTREMBL; Q9L4V6; Q9L4V6. 
SPTREMBL; Q9L4V7; Q9L4V7. 
SPTREMBL; Q9L4V8; Q9L4V8. 
SPTREMBL; Q9L4V9; Q9L4V9. 
SPTREMBL; Q9L4W0; Q9L4W0. 
SPTREMBL; Q9L4W1; Q9L4W1. 
SPTREMBL 
SPTREMBL 
SPTREMBL 

Key 

source 1..123580 



http://www.ebi.ac.uk/cgi-bin/emblfetch?id=AF263912&Submit=Go 10/11/2002 Christina Belisario 

FT /db_xref="taxon:1971" 

FT /organism="Streptomyces noursei" 

FT /strain="ATCC 11455" 

FT CDS complement(46..783) 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X7" 

FT /note="putative 4'-phosphopantheteine transferase" 

FT /transl_table=11 

FT /gene="nysF" 

FT /function="presumably post-translationally modifies ACP 

FT domains on PKS" 

FT /product="NysF" 

FT /protein_id="AAF71762.1" 

FT /translation="MIELILPATVATEAAYDDRPRPGDRLLSSEREVIARAVESRQREF 

FT TTVRHLARRALRRLGHPDRAILPNRRGAPQWPPGIVGSMTHCAGYRAAAVSPAELSAAV 

FT SIDAEPNGPLPAGVLNAIALPSERPHLVALAAHRPDVHWDRLLFSAKESVFKAWYPLTQ 

FT RELDFSEAEIVIDPTQGAFTARLLVPGPLLGGRRVTVFPGRWHSTPALLTTAVHLPAPT 

FT PRRDREHRTHLTVNSPLPRPTFG" 

FT CDS complement(867..2684) 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X6" 

FT /note="putative transporter (ABC family)" 

FT /transl_table=11 

FT /gene="nysG" 

FT /function="presumably involved in efflux of nystatin" 

FT /product="NysG" 

FT /protein_id="AAF71763.1" 

FT /translation="MASPDDLEEERTAPRPARRLVGLLRPHRRSVALAVSMGVGGIVLN 

FT AFGPLLLGRVTDLIADGVLGGVPGPAPGIDFAAIGRLLLVLLALYWASLFMLAQGRLV 

FT ASAVWRTIHELRRDAREKLTRLPLRHFDRQPAGELLSRTTNDIDNLQQTLQQTLAELIT 

FT SIFSLLTMLVLMLVISPSLAWMLLSVPVSALIAARISKRAQPHYAAQWSANGTLNAHV 

FT EEVCTGHALIKGFDRRAAAEERFDACNDAVYRAAAKAQFASGAMEPVMMFVANLGYVLV 

FT AVIGAWKVINGTLTLGDVQAFILYARQFSQPIVEIASVAGRLQSGIASAQRVFTLLDAP 

FT EQAPDPLRPGTPARAEGRVEFTDVSFRYSPDTPLIENLSLTVEPGSTVAIVGPTGAGKT 

FT TLGNLLMRFYEPDSGRILLDGTDTATMTRDDLRSRFGLVLQDTWLFGGTIAENIAYGAP 

FT GACRADIEEAARATCADRFIRTLPQGYDTVLDDESGTVSAGEKQLLTVARAFLARPAVL 

FT VLDEATSSVDTRTEVLIQRAMNSLRAGRTSFVIAHRLSTIRDADLIWMDAGRIVEQGT 

FT HDQLLCAQGLYARLHAARTHTPTAGAAAG" 

FT CDS complement(2662..4416) 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X5" 

FT /note="putative transporter (ABC family)" 

FT /transl_table=11 

FT /gene="nysH" 

FT /function="presumably involved in efflux of nystatin" 

FT /product="NysH" 

FT /protein_id="AAF71764.1" 

FT /translation="MLLRLLRAQLRPYAWATAALVALQLVQILGTLLLPTLGAALIDQG 

FT WRGDGGRITELGWMGWALVQIAAALGAAAIiAARTATAMGRDLRSALFRRILDFSAR 

FT EIGRFGTPSLLTRSVNDVQQVQNLAQTGFGIWCAPLMCLGSVLLALRQDVPLALLLVA 

FT LVLWAVCFGLLLARMGTLYARMQLTLDRLGRLLREAITGVRWRSFVRDDHERARFAQ 

FT TNDAFLWSRRVGRLIATMLPVVLLLMNGFTVALLWTGSHRIDAGRMPIGSLSALLSYL 

FT SLILMSWMLAFVFLSVPRARVCAGRIAEVLDTGSSVAPPAAPQPVRGPAGRIELCAAG 

FT YRYPGAEEPVLRDVDLTVEPGERIAVLGSTGSGKTTLLNLVLRLADATEGAVRVGGTDV 

FT RELTAATLAAAVGFVPQRPYLFSGTVASNLRFGRPDATDEELWEALRVAQAADFVARMP 

FT DGLDAEITQGGGNVSGGQRQRLSLARALLRRPEIYLFDDCFSALDQATDAALRTALVPY 

FT TAGATVITVAQRISAGRDADRIWLDRGRWAQGTHDVLLRTSPTYREIALSQLTEEEA 

FT AHGLAGRP" 

FT CDS 4714..5748 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X4" 

FT /note="putative dGDP-mannose-4,6-dehydratase" 

FT /transl_table=11 

FT /gene="nysDIII" 

FT /function="presumably involved in mycosamine biosynthesis" 

FT /product="NysDIII" 

FT /protein_id="AAF71765.1" 

FT /translation="MSKRALITGITGQDGSYIjAEHLLSQGYQVWGLIRGQANPRKSRVS 

FT RLASELDFIDGDLMDQGSLVSAVDTVQPDEVYNLGAISFVPMSWQQAELVTEVNGMGVL 

FT RMLEAIRMVSGLSTSRTVSPRGQIRFYQASSSEMFGKAAETPQRETTLFHPRSPYGAAK 

FT AYGHYITRNYRESFGMYAVSGMLFNHESPRRGQEFVTRKISI^VARIKQGLQDKLALGN 

FT LDAVRDWGYAGDYVRAMHLMLQQDAGDDYVIGTGQMHSVRDAVRIAFEHVGLNWEDYVV 

FT IDPDLVRPAEVEVLCADSAKAQDRLGWKPDVDFPTLMRMMVDSDLAQVSRENQYGDVLL 

FT AANW" 
FT CDS 5930..34363 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X3" 

FT /note="polyketide synthase" 

FT /transl_table=11 

FT /gene="nysI" 

FT /function="responsible for condensation steps 9 to 14 in 

FT the nystatin polyketide backbone synthesis" 

FT /product="NysI" 

FT /protein_id="AAF71766.1" 

FT /translation="MDNEQKLRDYLKLATADLRRTRRRVHKLESAAQEPVAIIGMTCRY 

FT PGGVRSPEDLWRMVEAGEHGVTPFPTDRGWDLEALAAAPTASGGFLHDAPDFDADFFGI 

FT SPREAVAMDPQQRWLESAWEAFERAGIDPTSVKGSRTGVFIGAMAQDYRVGPADGAEG 

FT FQLTGNTGSVLSGRISYTFGTVGPAVTVDTACSSSLVAVHLATQALRAGECTLALAGGV 

FT TIMSGPGTFIEMGRQGGLSADGRCRSFGDTADGTGWAEGVGILVLERLSDAVRNGHEIL 

FT AWRGTAVNQDGASNGLTAPNGPSQQQVIQQALVNARLAAGDIDWEAHGTGTTLGDPV 

FT EAQALLATYGQNRPADRPLLLGSVKSNLSHTQAAAGVAGVIKMVMAMRHGTLPRTLHAE 

FT EPTHHVDWSQGAVRLLTDTTDWPATGAPRRAAVSSFGISGTNAHTIIEQAPEPQPEDAA 

FT TAQDDAAGSTPATAPWPGWPVLLSGRTPDALRGQAAALRAALDTGRRPDLLDLAHSL 

FT ATTRAGFEHRAVLLATDHPALTDGLTALADADDPAAAPAWITGTTRAETRLAVLFTGQG 

FT AQRLGAGRELAARFPAFATALDAALDAFTPHLDRPLREVLWGTDAALLDRTAYAQPALF 

FT AVEVALYRLIESFGVRPDHLAGHSVGEIVAAHLAGVLSLADAATLVAARGRLMQALPDG 

FT GAMIAVQASEADVAPLLAGHEDQVAIAAVNGPSAWLSGAEATVTALAEQLAADGRKTR 

FT RLRVSHAFHSPLMEPMLDAFRAWEDLTLQPPLLPWSNLTGKPATVAQLTSADYWVDH 

FT VRHAVRFADGIDWLARHDTTAFLELGPDGVLSAMAQDCLDAADADAVTLPALRAGRPEE 

FT HTLTTALAGLHVHGATLDWTGCFAGTGARRTDLPTYAFQRRRYWPKALQSGTADLRSVG 

FT LGAAHHPLLSAAVSLADAGGTLLTGRLSRQTHPWLADHTVRGTTLLPGTAFLELAVRAG 

FT DEVGCDRVEELTLAAPLLLPEQGGVQVQLWIGNPDVSGRRTVNVHARPDTGDDTPWTAH 

FT ATGVLTTADASRQLPASSEQGGTPLAGDPHPALDAAQWPPAGAEPLPLDGHYDRLADGG 

FT FGYGPVFQGLRAAWRGGDWYAEVELPEAGRSDAEAFGLHPALLDAALHAAPFTGLGER 

FT GRGGLPFSWEGVSLHAGGATTLRVRLTPVADDALALTVADGTGAPVLSVDSLVLRSVAT 

FT QQLDTAAAVARDALFRLDWTPVQPTATDPGPVALLGADPFGLLTHAGFADAPAYPDLAA 

FT LAAADGPVPTTVVLSLAGTGDDAADPARSAHRCAAEALAAVQTWLDHHERFAAARLVFV 

FT TRGATVGRDVAAAAVWGLVRSAQSENPGCFALVDLDPDGAVGAAALVAALVSGEPQLAV 

FT RGDVLRVARLVRRPLTEVGAGADGTGDGVGDGSGVSFSGEGAVLVTGGTGGLGAVLARH 

FT LVAEYGVRDLLLVSRSGERAVGAGELVAELAGVGARVRWACDVTDRAAWELVGGHAV 

FT SAWHAAGVLDDGMVGALTGERLSAVLRPKVDAVWHLHEATRGLDLDAFWFSSLAGVF 

FT GSPGQANYAAANAFLDALMTRRRAEGLPGLSLAWGPWEQSGGMTGTLTDVDAERLARSG 

FT VPPLSVAQGLALFDAAVAGTDATCVPVRLDLPVLRARGEVPPLLRSLIRVRARRAAVAG 

FT SATAGNIAQRLRRLDEDGRDEMVLDLVRGQVALVLGHATGGDVDAGRAFRDLGFDSLTA 

FT VELRNRLNTVTGLRLPATLVFDYPTVRHLATYVLDELLGTDAEVATVQPAAVAVADDPI 

FT VIVGMACRYPGGVSSPEDLWRVLTEGTDAVSGFPTNRGWDVESLYHPDPDHPGTSYTRS 

FT GGFLHEAGEFDPGFFGMSPREALATDSQQRLLLESSWEAIERAGIDPVSLRGSRTGVFA 

FT GVMYSDYSAMLJ^PEFEGFQGSGSSPSLASGRVAYTLGLEGPAVTVDTACSSSLVAMHW 

FT AMQALRSGECGLALAGGVTVMSTPAVFV0FARQRGLSPDGRCKAFADAADGVGWSEGVG 

FT VLVLERQSDAVRNGHEILAWRGSAVNQDGASNGLTAPNGPSQQRVIRQALASGGLTAG 

FT DVDWEAHGTGTTLGDPIEAQALIiATYGRDREPERPLLLGSVKSNLGHTQAAAGVAGVI 

FT KMVLAMRHGWPRTLHVDAPSSHVDWSEGAVELLSEQAAWPETGRVRRAGVSSFGISGT 

FT NVHVIVEQAPGAKAIAAAGAARRTPGAVPVLLSGRGRSALRGQAARLLGHLQARPDAEL 

FT VDVALSLATTRSRFEQRAAWAQDRDQLIASLGALAADRPDPAWEGEAAGRGRTAVLF 

FT TGQGSQRAAMGRELHEVQPEFAAAFDAVCAVFDPLLDRPLREWFAEDGSDEAALLDET 

FT GWTQPALFAVEVALFRLVESWGVRPDFVAGHS IGEIAAAHVAGVLTLEDACRLVAARAT 

FT LMQALPTGGAMIAIQATEDEIAAHLDDTVAIAAVNGPQSWISGDEEAAETIAATFAER 

FT GRKTKRLRVSHAFHSPRMDGMLDAFRIVAEGLTYRAPRIPLVSDLTGRRADDAEVCTAE 

FT YWVRHVREAVRFADCVRTLRDAGATTFLELGSDGLLTAMAEDTLGDDHDAELVPMLRAG 

FT RAEELAAATALARLQVRGVDVDWAAYLAGTGARRTDLPTYAFQHAYYWPQLPTPAAALA 

FT AADPADQQLWAAVERGDARELADILGLGEQDLTPLDSLLPALTSWRRGNQEKHLLDTLR 

FT YRVEWTRLS KPTAPVLDGTWLLVASDATAADQPALLDGLADALGSHGARVRRLLLDDS C 

FT ADRAVIAERLARTADVDAATQVLSVLPLDERDADDCPPLTRGI^TVALVQAIJU^ 

FT GRLWTATRGAVSTNPADPVTHPVQAAAWGLGRGVALEHPRLWGGLVDLPQVFDERAGQR 

FT LAGILAVKDAPDGEDQVALRATGVSGRRLVRHTVEALPTAAEFTATGTVLITGGTGGLG 

FT AEVARWLARAGAQHLVLTSRRGPDAPGAAELRAELEGYGPSVSWACDVADRDALAAVL 

FT TALPEELPLTGWHTAGVGHYGPLDTLSTAEFAGLTAAKIiAGAAHLDALLADRELDFFV 

FT LFGS I AGVWGSGNQSAYGAANAYLDALALHRRARGLAATSVAWGPWAEAGMAADDAVSE 

FT TLRRQGLGIiLDPAPAMTELRRAWRQDVTVTVADVDWQRYAPLFTSARPSALIAGLPEV 

FT RALAADERTEQDATGASEWTRVRALAEPEQLRLLTDLVRTESATVLGHSSADAVPEGR 

FT AFRDVGFDSLTAVELRKRLGAATGLSLPSTMVFDYPTPLELAQYLRAEILGAVLEVAGP 

FT VATGGADDEPIAIIGMACRFPGGVSSPEQLWDLVASGTDAISEFPVNRGWQTGHLFDPD 

FT PDRPGTTYSTQGGFLHEADEFDPTFFGISPREALVMDPQQRLLLETTWESFERAGIRPE 

FT TLRSTLTGTFVGSSYQEYGLGAGDGTEGHMVTGSSPSVLSGRLSYVFGLEGPAVTVDTA 

FT CSSSLVALHLACQSLRNGESNLAVAGGATIMTTPNPFIAFSRQRALAKDGRCKAFSDDA 

FT DGMTLAEGVGWLVERLSDAQRNGHPVLAVLRGSAINQDGASNGLTAPNGPSQQRVIRQ 

FT ALANARLAPGDIDALEAHGTGTPLGDPIEAQALFATYGRDRDPESALLLGSVKSNIGHT 

FT QSAAGIASVIKMVMALRHSELPPTLHADAPSSHVDWSAGTVRLLTQARAWPETGRPRRA 

FT AVS S FG I SGTNAHVLLEQAP VADTP AEERPAVAPVP I AAG WP WWTARS AAALRGQAE" 

FT RLLAHAETVGTALPAAGPLDIGLSLVSARARFEHRAVWPPAGTDPLAALRAVATDGPS 

FT PWARGVADVEGRTVFVFPGQGSQWVGMGSQLLDESAVFAERIAECAAALAEFTDWSLV 

FT DVLRGWGAPSLERVDWQPASFAVMVSLAALWRSRGVLPDAWGHSQGEIAAAWSGA 

FT LSLRDGARWALRSQAIGRALAGRGGMMSVALSVDVLEPRLVEFEGRVSVAAVNGPRSV 

FT WAGEPEALDALHARLTADDIRARRIAVDYASHSHQVEDLHEELLEVLAELAPRTSEyP 

FT FFSTVTGDWLDTARMDAGYWFRNLRGRVRFADAVADLLAAEYRAFVEVSSHPVLSMAVQ 

FT EAIDEAGVPAVAAGTLRRDQGGTDRFLLSAAEVFVRGVDVDWAGLFEGTGASRIDLPTY 

FT AFQHEHLWAVPPAPEAVAAADPDDT^FWTAVEDGDVSALTAALGTDEDSVAAVLPALTS 

FT WRRARRDRSTVDAWRYRVAWKPLGGTLPHPSLTGTWLLVTADGIDDTDVAGALETYGAE 

FT VRRLVLDEECVDRAVLRERLAGAEDVTGI VSVLAAAERTDAVPGTSLVLGTALTV^ 

FT ALGDAEIDAPVWALTRGAVSTGRADELTAPVQAQVTGIGWTAALEHPQRWGGTLDLPAA 

FT LDARAAQRLAAVLSGALGSDDQLAIRPSGVFTRRIVRAEATAGRPAGTWTPRGTTLVTG 

FT GSGTLAPHLARWLAQRGAEHLVL I SRRGTAAPGAAELVAELAESGTEATVAACDITDRD 

FT AVAALLADLKADGRTVRTVVHTAATIELHTLDATTLADFDRVLHAKVTGAQVL^ 

FT EELDDFVLYSSTAGMWGSGAHAAYVAGNAYLAALAEHRRANGLPALSLSWGIWADDLKL 

FT GRVDPQMIRRSGLEFMDPQLALSGLQRALDDNENVIiAVADVDWETYHPVYTSGRPTPLF 

FT DEVPEVRRLTAAAEQSAGTVAEGEFAAALRALSDAEQQRTLLETVRTEAASVLGLSSAE 

FT DLTDQRAFRDVGFDSLTAVGLRNRLASVTGLTLPSTMVFDYPNPAALAAYLHGELAGAR 

FT SAAAGAAAVPTGAPDADDPIAIVGMSCRYPGGVGSAEDLWRIALDEVDAISGFPADRGW 

FT DAEGLYDPDPDRPGRTYSVQGGFLRDVAEFDPGFFGISPREALSMDPQQRLLLETAWEA 

FT FEHAGIDPVGQRGSRTGTFVGASYQDYASGVPNSEGSEGHMITGTLSSVLSGRVSYLFG 

FT FEGPAVTLDTACSSSLVAMHLACQSLRNGESSLAIiAGGVSIMSTPMSFVGFSRQRALAE 

FT DGRCKAYADGADGMTLAEGVGLVLLERLSDARANGHQVLAVIRGSAVNQDGASNGLTAP 

FT NGPSQQRVIRQALANSAVAPGDIDVLEGHGTGTALGDPIEAQALLATYGQDRAPERPLL 

FT LGSVKSNIGHTQMASGVASVIKLVRALQEGWPKSLHIDRPSTHVDWSSGAIGLLTERT 

FT PWPETGRPRRAAVSSFGISGTNVHTILEQAPADEAPTPADPPRDGLVPVLLSGRGEAAL 

FT RAQAARLLAFVEERPEAHLTDLAHSLATSRAALERRAAVIAADRDTLTRGLRALSDGRP 

FT DPGLVQGTAGRGRTAFLFTGQGSQRPGMGRELHDRYPVFADALDEVLARLDDGPDRPLR 

FT EVLFAAPDSAEAALLDRTGYAQPALFAVEVALFRLLTSWGLTPDYLAGHSVGELAAAHV 

FT AGVLSLDDACTLVAARGRLMQALPEGGAIWALEAAEDEVLPLLEGLTDRVSVAAVNGPR 

FT SVWAGVEEDVLLLADLFAADGRRTKRLRVSHAFHSPLMDAMLDDFAAVARGIiTYHPPT 
FT IPFVSNVSGGLATAEQVRTPDYWVGHVRAAVRFADGIDWLATQGDVHTFLELGPDGVLS 

FT AMARESLTDPSRTALLPTLRGDRPEEPALVTAVAAAHAHGARVDWSGYFADHGARRTTL 

FT . PTYAFQRERYWPDTTAATSAHTPGSALDAEFWAAVERDDVAALAASLDLDDATVTAMVP 

FT ALTAWRRRRGEQTELDSWRYRVTWKPRGGATAPAALTGRWLVLVPHDHQDRQDDATAAW 

FT AADVETALGTTTVRLTVTTTDRAALAARITEAAGDQGPFSGVLSLLPLATGDAGHPGAP 

FT AALTLTTTAVQALGDAGIDAPLWNVTRGAVAVGRAEQVTAPEQAAVWGLGRAVALELPA 

FT RFGGTLDLPATLDGQAARRLRAVLAATDGEDAVALRPSGVFLRRLAHAPAGPDTARTAF 

FT DPAAGTVLITGGTGGIGGHVARRLARDGATHLLLTSRRGPAAPGADAIiRAELEELGARV 

FT TLAACDAADRDALAALLAELPDDAPLCAVFHTAGVVEDHVVDALTPENFAAVLRAKTVA 

FT AHHLHELTADLDLAAFVLFSSTAGVLGAAGQGNYAAANAHLDALAEHRRSHGLTALSVA 

FT WGPWAGSGMVADAAELTDRVRRGGFEPIiAPEPAVRALLRAIENDDTTVALADIDWERFQ 

FT RAFAAVRPLPFVADLPETGRATPATATGAATGLRQQLAELPEHERPAAVLDLLRTQVAA 

. ' FT VLGHADPRTVEDDHAFRDLGFDSLTILELRNALNAATGLSLPATLVYDLPTPREMADFL 

FT LAELLGTLPTDTAATVASTASPKLSASFEQGGTPFDDPIAVIGIGCRFPGGVTTPEELW 

FT QLLDEGRDGISRFPDDRGWDLAALGAGASDTLEGGFLTGVADFDARFFGISPREALAMD 

FT PQQRLLLETTWEALERAGIDPTTLRGSTTGVFVGTNGQDYPTLLRRSASDVAGYVATGN 

FT TASVMSGRLSYALGLEGPAVTIDTACSSSLVALHWAGRALRAGECDLWAGGVSVMASP 

FT DSFVEFSTQGGLAPDGRCKAFSDAADGTAWSEGVGILVLERLSAARRNGHQVLGLIRGT 

FT AVNQDGASNGLTAPNGLSQQRVIAQALADARLRPADIDAIEAHGTGTTLGDPIEARALI 

FT TAYGRDRDAERPLLLGTVKSNIGHTQAAAGAAGVIKMLMAMRHGTLPRTLHVGTPSSHV 

FT DWSGGTVALLDDARPWPRTGQPRRAGVSAFGVSGTNAHVWEQAPETEAPAAPAAEPAP 

FT EATPTWPWWSGRSREALQAQLDRLTAHTAAHPARSAADVGRSLATDRTLFPHRAVLL 

FT AGPDGVREAARAAAPRTPGRTAFLFSGQGAQHALMGHDLYQRFPVYADALDTVLAQFDT 

FT VLDVPLRAALFAAPGTPEAALLDQTGFTQPALFAVEVALFRLAES WRLTPDFVAGHS IG 

FT EIAAAHVAGVFSLEDACTLVAARASLMQQLPRDGAMVALEATEDEVAPLLTDGVALAAV 

FT NGPRSVWAGAEDAVRAVADRLAADGRRTRRLTVSHAFHSPLMDPMLTDFARVAEGLTY 

FT HEPRIPLVSTLLGAPAGAELRTPDYWVRHVRETVRFADGVRALHDAGAGTFVEIGPDGV ' 

FT LTALTQQTLDTVEAGAPAVWPLQRRDRAGDLALLEGLATLHTHGTGPSWPAYFEATGG 

FT HRTDLPTYAFQRERYWPELGAPVATAPQDPAAWRYHETWAPLPAPEAAAPAGRALVLVP 

FT AGNRDTAWMTAVADALGADTVTAEPDALAEQLTAAGDTPWRVWSLLAAASEGLPADGA 

FT WPAALLATLDEAGVHAPLWCVTRGAVAVAGEAPTAVGQAALWGLGRVAALDHPDRFGGL 

FT ADLPADTDAHAAGLLAAHIiAAPGTEAE I AVRATGVHARRLVRTP AAADG ATWL PTGTVL 

FT WGGTGGTGTMGGRAARWLVREGARHLVLTAPDGTTTAADTEALTAELAALGARITWD 

FT HDPTAPDGFAALLDGLPDDTPLTAWYAPEADAAPGTAAELSAALAPVTALGAALTGRP 

FT LDAFVLFGS I AGLWGVRGRAAEAASGAYLDAFARACRDRGTPALAVAWGAWADLVGPSL 

FT AAHLRMNGLPVMDADTALTALSRAVADGSAAEAVADVRWETFAPLHHEARRTALFDALP 

FT EARGALAEAARDRADRKTAAGDYGRWLAEQPAADHDAILI^VTEKAATVLG 

FT EPDLPFRDLGFDSLTAVDLRNQLTAETGLTLPATLVFDHPNPAALAAHLRAQLLGEASD 

FT SAAPVAAPVALGADDDAIVIVGMACRYPGGVTSPEDLWQLVGDEVDAVGDFPTDRGWDL 

FT AALAGDGPGRSATAQGGFLYDATDFDPGLFGISPREALVMDPQQRILLETSWEALERAG 

FT IDPATLRGSGTTGVFVGGGSGDYRPPEEAGQWQTAQSASLLSGRLAYTFGIQGPTVSVD . 

FT TACSSSLVALHLAAQALRAGECSIALAGGVTVMATPVGFVEFSAQGALSPDGRCRAFSD 

FT DANGTGWSEGVGMLWERLSDARRNGHRVLAVLRGSAINQDGASNGLTAPSGPAQQRVI 

FT RQALANARLRPADIDAVEAHGTGTRLGDPIEAQALLATYGQDRERPVLLGSLKSNIGHT 

FT QAASGVGGVIKMVLAMQHGELPRSLYAENPSSHVDWTAGRAHLLTARTPWPDSGRPRRA 

FT AVSSFGASGTNAHAILEQPPREELPARPADDGAPLPFLLSGRSQNALRAQARRLLARLT 

FT ' AHPDTRAADLAYSLATTRAAFEHRAAITATDHDGLRTGLTAVAEGTTAPHTAEHHLQGT 

FT GKRAVLFSGQGSQRLGMGRELHERHPVFAEAFDSVLARLDDRLDTPLRDWWGTDEEAL 

FT HATGNTQPALFAVEVALYRLIESWGVRPDFVAGHSVGEIiAAAHVAGVLSLDDACRLVAA 

FT RAALMQRLPAGGAMIAVEATEDEVTPLLTDGVSLAAVNGPTAWLSGAGDAVTALGQAL 

FT AERGHRTTRLRVSHAFHSHLMDPMLADFRTVAEGLEYHPPRIPWSNLTGDVADAADLC 

FT SADYWVRHVRGTVRFADGVRTMADRGVHLFLELGPDAVLSAMARQCAPDAVWPALRRN 

FT ' RDEDETLVGAVARLHVHGAGPRWDAYFAGRGAQWLDLPTYPFQRGRFWPESLPGAASAA 

FT PAAGQPAETDAAFWDAVAQEDFTALESVLDVESDALSKVLPALMDWRSRQADESQLAGW 

FT RHRIVWKRLTGAALAHRKALSGTWIJIWPEGFADDPWVTTTLDGLGTHLVHLEVAEA^ 

FT AALADAIAARTADGTRFGGVISLLALREELTGAVPEGTALTTTLLQALGDAGVDAPLWC 

FT VTRSAVSAGRTDRPHRPLQGAVWGLGRVAALEYPQRWGGLVDLPEEPDERSAAGLAAVL 

FT AGLDGEDQVAVRGTAVLARRLVPAPGRKPSRPWHPSGTVLVTGGTGALGAHVARRIAKD 

FT GAQHLVLLSRRGPDAPGAAELRAELDALGTDVTVAACDVADRDQLTAVLDALPADRPLT 

FT GVVHTAGVLpDGVLDRLTPERFQEVFRAKVTSALLLDELTRDRELAAFVLFSSASAAVG 
FT NPGQANYAAANAVLDALAEQRRVLGLPATSVSWGAWGGGGMADADGADEAARRAGVGAM 

FT DPHLAVEALLRLVAEKEPTAWAEVALDRFAGAFGGSRPSALLREFPGYREALAAQAEQ 

FT • AADGGGLAARLAALPPARRLDTVVDLVRTRAAQVLGYPDTEAVAAERSFRDLGVDSLGA 

FT VELRNQLSAATGLNLPATLVFDHPTPLVLGEHILGGLFPDEPAGSDDETEIRALLASVP 

FT LDQLREIGVLEPLLQLAGRGGRAADGDDGESVDSMTVADLVRAALNGQSDL " 

FT CDS 34384..50691 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X2" 

FT /note="polyketide synthase" 

FT /transl_table=11 

FT /gene="nysJ" 

FT /function="responsible for condensation steps 15 to 17 in 

FT the nystatin polyketide backbone synthesis" 

FT /product="NysJ" 

FT /protein_id="AAF71767.1" 

FT /translation="MNAPENPETPENNWAALRAAVKETDRLRRQNRMLVAAAKEPIAV 

FT VGMACRFPGAVDSPEALWEMVATGTDVISGFPDDRGWDLEALRNSGTDARDTDVSQRGG 

FT FLDCIADFDPGFFGISPREAVTMDPQQRLLLTTAWEAVERAGIDATTLRATRTGAFIGT 

FT NGQDYAYLLVRSLDDATGDVGTGIAASAASGRLSYTLGLEGPALTVDTACSSSLVALHL 

FT AVQALRNGECGMALAGGVNVMATPGSLVEFSRQGGLARDGRCKAFADAADGTGWSEGAG 

FT VLLLERLSDAQRNGHPVLAWRGSAVNQDGASNGFTAPNGPSQQRVIRQALANAGLATG 

FT DIDAVEAHGTGTPLGDPIEAQSILATYGQDRAHPVLLGSIKSNMGHTQAASGVAGVIKM 

FT IMAMRHGVLPRTLHVDRPSTHVDWTTGSVELLTDAHPWPETGRPRRTGISSFGVSGTNA 

FT HVIVEQAPDTPAEAADDTPPRTPRTLPWLLSARTGAALRDQATALLDHLDRPDGDRGPT 

FT ALDTAFSLATTRAALEHRLiAWTGTDGTAGRDALTAWLAHGTAPDAHEGHAAGRTRCAA 

FT LFSGQGAQRLGMGRELHARFPVFARALDTAVDLLDAELGGTLREVIWGTDDAPLNETGF 

FT TQPALFAVEVALYRLIESWGVAPDFVAGHSIGEIAAAHVAGVFSLEDACTLVAARAGLM ' 

FT QALPRGGAMVAVEATEDEVS PLLTDGVAIAAINGPTSLWSGDETATLAVAARLAEQGR 

FT RTTRLRVSHAFHSPLMDPMLAEFRAVAEGLSYGEPQIPWSNLTGAVADGTLLGTADYW 

FT VRHVREAVRFADGIRALTDAGVGAFLELGPDGTLiAALAQQSAPDAVSVPVLRKDRDEEP 

FT AAVAALARLHTAGVPVDWTAFYAGTGAHRTDLPTYAFQYERYWPKATYRPADATGLGLT 

FT AADHPLLGAAMSVAGSDELLLTGTLSLATHPWLADHWGGMVFFPGTGFLELAVRAADQ 

FT VGCDRVEELMLAAPLILPATGTVQMQIAVGAADDDGGRDLRFFTRPGDDPDAAWAQHAT 

FT GRITEGERVLALDTTTWPPRDAEPVDIDGLYDRYRANGLDYGPVFRGLRAVWRRDTEIY 

FT AEVALPEGTADADAFGLHPALFDAVLHSTLFASADGDDRSLLPFAWNGVSLHAAGADAL 

FT RVRITSCGPDAVEITAVDPQGRPWSVESLTLRAAGPDAGTADHRADAGSLFRMDWTPR 

FT TVHAPATPATWAVLGTDPIGLTEALTAAGPDTVTGLRDGVDALGELTAGDDRPVPDWA 

FT VPLRGATDHGPAGAHDLTRTVLALLQEWLAEERFARSRLLLVTRGAVADGERGPLDLAA 

FT APVWGLVRSAQSENPGRLLLVDLDDTAESAAQLPLLPALLDADEPQAWREGTVRVGRL 

FT . ARLDSGRGLVPPPGTPWRLGSRAKGSLDGLALLPHPEARRPLTGHEVRVGIRAAGLNFR 

FT DVLNALGMYPGDAGLFGSEAAGVWEVGPEVTGLAPGDRVMGMLFGGFGPLGIADARLL 

FT TPVPADWSWETGASVPLVFLTAYYALKELGGLRAGEKVLVHAGAGGVGMAAIQIARHVG 

FT AEVFATASEGKWDVLRSLGVADDHIASSRTLDFEAAFAEVAGDRGLDWLNALSGEFVD 

FT ASMRLLGDGGRFLEMGKTDIRAADSVPDGLSYHSFDLGMVDPEHIQRMLLDLVEI.FDRG 

FT ALAALPVRSWDVRRAGEAFRFMSLAQHIGKIVLTVPQPLDPDGTVLLTGGTGGLAGLLA 

FT RHLVTEHGARHLLLAGRRGPDAPGAAALHAELTALGAEVTVAACDVADRTALAALLATV 

FT PAEHPLTAVVHTAGVLDDGTLTAIiNPDRIiATVLRPKVDAAWHLHDLTRHLDIjT^FVLYS 

. FT STAGVMGGPGQANYAAGNTFLDALAAHRHALGLPATSLAWGAWEQGAGMTGALTDHDLR 

FT RVSDAGGQPLLTAERGLALYDAATAADEPLIVPLGLTGGALPAGVGVPAVLRGLVRTAG 

FT RRARAGTAGVSRAGLAERLAALPEEERTPFLVELVRTEAATVLGHGSTDPVDARREFRQ 

FT LGFDSLTAIELRNRLGKATGLTLPATLIFDYPTPDRLAVHLHDELLGADAPVTVTAAAQ 

FT AADPEHDPWIVGMSCRFPGGVSSPEELWDLVASGTDAITGFPADRAWDRHPQLAGAPG 

FT ARTGQGGFLRDIADFDAAFFGISPREALAMDPQQRILLEVAWEAAERAGIDPQTLRGSD 

FT TGVFMGVSGQDYAGLVMRSRDDIAGHATTGLAVSWSGRLAYALGLEGPALSVDTACSS 

FT SLVSLHLAAQALRAGECTMALAGGVTVMTTAANFTGFSRMGGLAQDGRCKAFSDSADGT 

FT GWSEGAAVLVLERLSDARRAGHRVLAWRGSAVNQDGASNGLTAPNGPAQQRVIRQALA 

FT NAGLTPVDVDAVEAHGTGTPLGDPIEAQALIAAYGTDRDPEHPLLLGSVKSNIGHTQSA 

FT AGAAGLVKMVMAMRHGILPQTLHLTEPSSHVDWSAGTVRLLTERTAWPRTDRPRRAGVS 

FT SFGISGTNAHVILEQPPAEPTPAADPGRPAPTWAWPVSAQTPAALDAQLDRLRTAAAL 

FT APLDTAHTLATGRSLFEHRAVLLATVGDPATGAPDLPEVARGAATPHRTAFLFSGQGAQ 

FT RSGMGRELHAAFPVFAAAFDEWAVLDAELGSDADGGVSLREVMWGGGSELLDRTRFTQ 

FT PALFAVEVALFRLVASWGVGPEFVAGHSVGEIAAAHVAGVFSLVDACRLWARASLMDA 

FT . LPVGGVMVAVEAAEAEWPLLVDGVAIAAVNGPVSVWSGVEAAVGQWDQLVERGRRV 

FT RRLAVSHAFHSPLMDPMLDAFRAVAEGLEYHQPRIPWSNVTGEVAAAEELCAADYWVR 

FT HVRATVRFADGVRTIiAERGATAFLE IGPDGVLS ALARGVLPAEALVTPTLRKDRDEESA 

FT LLAGLARLHVAGVTVDWSAALTGTGARGTDLPTYAFQRERYWPELAAEPAGGGADAADA 

FT EFWAAVERADATALAAHLDIDGDQLGAVLPALSAWRTRRRTTSATNALRHRESWEPLSL 

FT AGTPHTGGVLVLVPAAATTDPWVADWAALGPDARRVDVPADGTDRAALAALLTEAADD 

FT TAPTAWSLLALDETSGDDAVPAGTTATAALVQALADTGAPAPLWALTRGAVAALPDEQ 

FT PTAPAQAAVWGLGRIAALELPRHWGGLVDLPADLDERTARRLPAALADAGDEDQLALRA 

FT TGAYGRRITPAPAPDDAPGTGWQPTGTVLITGGTGALGRHTARWLAAHGAEHLLLLSRS 

FT GPDAPGAAELTTELTALGARVTLVACDAADREQLTRVLAEVPRDCPLTGWHTAGVLDD 

FT GVLTGLTPDRFATVFRAKVAS AVLLDELTRDTDLAVFALFS S VAGAVGNPGQAGYAAAN 

FT AVLDALAARRRAQGLAGTS I AWGAWAGDGMAARHTRPGAEPVGLLDPDIiAVPALARAVT 

FT EPQPTLVLADLQQPRLLESLLALRPSPLLSRLPAARTAARAVQEADRRRAGAAADLRDQ 

FT LAGTAPADRHAVLLRLVRTTAAAVLGHTGADAIRADKPFRDLGFDSLTAVELSSALAAA 

FT TGIiALPPSLVFDHPSPRAIjADHLRAELTGDRPESAPAAPPAPVPAADDDPI\AA^GMACR 

FT FPGGVTTPEEFWQLLAEGRDGIDAFPTDRGWDLDVLGRRRPGPQRPPRSAASSYDAAAF 

FT DPGFFDISPREALAMDPQQRLLLETAWEAVERTGTDPTRLRGSRTGVFVGTNGQDYAGL 

FT VLRAQEDVEGHAGTGLAASVISGRLAYAFGFEGPAVTVDTACSSSLVALHWAVQALRAG 

FT ECSLALAGGVTVMTTSTSFAGFTRQGGLAPDGHCKAFSDSADGTGWSEGVGVLVVERRS 

FT DALRNGHE ILAWRGS AVNQDGASNGLTAPNGPAQQRVIRQALANAGLAPGDVDAVEAH 

FT GTGTVLGDPIEAQALLATYGQDRPADRPLWLGSVKSNIGHTQAAAGAAGLMKMVLALQH 

FT GTLPRTLHVTEPSTRVDWSAGAVRLIiTERTVWPRTDRPRRAGVSSFGISGTNAHVILEQ 

FT PPAEPTPTAPADRPTRTPAVLPWWSARSATALDAQLARLRAFAAERPDLPPADVAHSL 

FT VTSRATFEHRAVLLAAPDGITAAARAEARERSTAFLFSGQGAQRSGMGRELHAAFPVFA 

FT AAFDEWAVLDAELATGSGGGVSLREVMWGGGSELLDRTRFTQPALFAVEVALFRLVAS ■ 

FT WGVGPEFVAGHSVGEIAAAYVAGVFSLVDACRLWARASLMDALPVGGVMVAVEAAEAE 

FT WPLLVDGVAI AAVNGPVS VWSGVEAAVGQWDQLVERGRRVRRLAVSHAFHS PLMDP 

FT MLDAFRAVAEGLEYHQPRIPWSNVTGEVAAAEELCAADYWVRHVRATVRFADGVRTLA 

FT ERGATAFLEIGPDGVLSALAAACLFDTDAEWPALRKGRPEEHTALTAAAQLHVAGVDI 

FT DWTAVLAGTGGRRIALPTYAFQRERYWPSLAAQAPGDAGGLGLEAGRHPLLGAATTVAG 

FT SAEILLTGRLSTTAQPWLAVYEADGRTVLPAAVLAELAVRAGDQADCPTVAELTVAAPL 

FT VLTGAAAQRLQVRVAAPDDTGRRALS VHARPDDS PDS PWTLHATAVLTHDTPQPPAPDT 

FT GWPPERAVPLDALPTATGPARIAAAWQWGDELCAEIELPEPGPAERAFALHPALLDTAV 

FT RAGGLLDGDATLDALGWRGLALHAASATALRVRLTPDGTDTWALEATDPQGAPWSVTG 

FT LTLGTPTVDRSGAGAADDGATLLDLEWVPAPQAAPTGGDHLPYAVLGDQLAELDGQLRI 

FT AGDGPGRVASLAALLDGGAPLPRLVLAPVLGVPTGEGDLPAAVRGTTTAVLELLQRWTA 

FT DARTADSHLVIVTRGAVAAGAEDVHDLAAAPVWGLVRSAQSEHPGSFLLLDLDPADPAG 

FT ASRAAAPATLAALLDAGETQAAVRADTLTVARLTRAADGPEATAGHPVRDWDRDGTVLI 

FT TGGTGGLGGLLARHLVTGHGIKHLLLAGRRGPDAPGARALRDELAALGAEVTVAACDVA 

FT DRAALDRLLAQLPPEHPLTAWHTAGVLDDATVGTLTPERLDTVLRAKADAAWHLHDAT 

FT RDRDLAGFVLYSSVAGVTGGPGQGNYAAGNTFLDALAAHRAAQGLPGLSLAWGPWGQDA 

FT GMTGTLGAADLARLERSGMPPLTPEQGLALFDAAGARGDGFAVAVRLARGAAAPGADEV 

FT PAVLRALVRGRRRTAAAAGHAGVLARRLAALDAEQRHQALLDLVRTETAAVLGHSGADA 

FT VPAERDFNRLGFDSLMAVELRTRLATATGARLPATLVFDHPTPDAVARHLASTLPGGTA 

FT AGPDRS PLAELDR I AAELS PEGADDATRQGWGRLRHLLAQWDGTRQDGGGTTVDDRIE 

FT AASAEEVLAFIDHELGRQADS " 

FT CDS 50747..56947 

FT /codon_start=1 

FT /db_xref="SPTREMBL:Q9L4X1" 

FT /note="polyketide synthase/thioesterase; contains a 

FT C-terminal thioesterase domain" 

FT /transl_table=11 

FT /gene="nysK" 

FT /function="responsible for the last condensation step in 

FT the nystatin polyketide backbone synthesis" 

FT /product="NysK" 

FT /protein_id="AAF71768.1" 

FT /translation="MPDEKKLVDYLKWVTKDLHQTRQRLQEVEAGRHEPVAIVGMACRF 

FT PGGVRSPEDLWELLSAGRDGIGPFPADRGWDIiAALAGDGPGRSATQEGGFLPDAAAFDP 

FT GFFDISPREALAMDPQQRLLLETAWEAVERSGIDPAGLRGSRTGVFVGTNGQDYAHLVL 

FT . AAQDDMGGYAGNGLAASVLSGRLAFALGLEGPAVTLDTACSSSLVTLHLAAQAVRAGEC 

FT GLALAGGVTVMTTSSSFAGFSLQGGLAPDGRCKAFAEAADGTGWSEGIGLLLVERLSDA 

FT QRNGHPVLAVLRGSAVNQDGASNGLSAPNGPSQQRVIRQALAGAGLVPGDVDAVEAHGT 

FT GTRLGDPIEAGALLATYGQDRPADRPLWLGSVKSNLGHTQAAAGVAGVIKMVLALRHGV 

FT LPQTLHVDAPSSHVDWESGAVRLLTAPVAWSEGDDRVRRAGVSSFGISGTNAHVILEQA 

FT PDQPEPTAEETAAAAPGGTAEERAAAPVAPRAVPWPVAARTAGALDAQLVRVRALTTAP 

FT GRTAADVGHALATARTPFEHRALLVHEGGAVTEVARGAVPTGDRGGLAVLFSGQGSQRP 

FT GMGRELHARYPVFAAAFDETVALLDARLGTSLRDIVWDQDRTRLDDTRHTQPALFAVEV 

FT ALYRLLASWGIRPDHVTGHS IGE ITAAHVAGVLTLADACTLVAARATAMSELPPGGAMV 

FT ALEATEDEVRPLLTDDLAIAAVNAPRSVWAGAEDAALAVRRHFDDLGRRTTRLPVSHA 

FT FHSPL.MDPMLDAFRTALAPLTFAEPEIPWSNLTGLPATAEELATPHYWVCHVRQAVRF 

FT GDGVRALADRGVRTFLELGPDGVLSALVRENLPEPGLVAVPVLRKERPEETTVLAALGT 

FT LWAHGADVDWDAVFAGTRTPQADPVELPTYAFQRARYWPTLGARHGDPADLGQTAAAHP 

FT LLGAAVTLADADETVLTGRLALPSHPWLGDHRSDGRITVPGVAFAELAVRAGDLSGTPH 

FT LARLDLPAPLTLGDGDTVTLQVRVGAPDPAGHRPLTVHARLAATEDAPWTTCATGLLAP 

FT DAPEAPADPIGPADAGWPPRDARPVPVADLDAAATAAGRHYGPHFQGLTGLWRRDGEVF 

FT AEVALPTATAADRAFGIHPALLATALRATAALDDDHTAGHTPEPTGITGLALHATGATA 

FT LRVRLTATGPDTVALAAADATGGAVLTADTVTLGSPQDRPAPAPAGHTGQGGLFHLDWV 

FT PVDPGSRATGTRWAWGDDELDLGYAIiHRADETVSAYAASLGGAIGDSGLAPDVFLVPV 

FT VGGPDAGPDAVHAVTARALGLLQEWLNEPRIiAGARLVFVTRGAVAVPGETVTDPAGAAV 

FT . WGLLRSAQTENPGSLLLVDLDDAFRSAGMLPHVLTLDEQQLVVRDHAVRAARLARLPEP 

FT AAGTAPARAWDPDGTVLITGGTGGLGAALARHLVTVRGARHLLLAGRRGPEAPGAGELV 

FT AELTAQGADVRVAACDVGDRTALDALLATVPAAHPLTAWHTAGVLDDALIGSLTPDQL 

FT ATVLRPKADAAWHLHDATRGLDLAGFVLYS S VSGVLGS PGQGNYAAANAYLDALARHRA 

FT DQGLPALSLAWGPWGRGSGMTASVSDADLERMARGGLPPLTVEDGLALFDAAVGRPEPA ' 

FT LVPSRINVAGLRDQQALPALWRDLVPRARRTAATADRSPVTVRERLRHLDETGQEQLLI 

FT DLWGYTAGLLGHPDPTAVDPERGFLELGFDSLVSVGLRNQLAEILGLRLPSSIVFDSK

FT SPVKLARWLHQELANGPQPGATGPAAADARPAVRSDDTLEGLFYNAVRGGKLVEAMRML

FT KAVANTRPMFDTPAELEELSEPVTLADGPGRPRLIFVSAPGATGGVHQYARIAAHFRGS 

FT RHVSALPLMGFAPGELLPATSEAAARIVAESVLMASEGEPFVMVGHSTGGSLAYLAAGV 

FT LEDTWDVRPEAWLLDTASIRYNPGEGNDLDRTTRFYLADIDSPSVTLNSARMSAMAHW

FT FMAMTDIQAPAPTAPTLLVRAARALDGFRLDTSSVPADEVRDIDADHLSLAKEHSALTA 

FT QAIEGWLAELPDPAA" 

FT CDS 57095..58279

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4X0"

FT /note="P450 monooxygenase"

FT /transl_table=11

FT /gene="nysL"

FT /function="presumably involved in modification of the

FT nystatin macrolactone ring"

FT /product="NysL"

FT /protein_id="AAF71769.1"

FT /translation="MSTPTAPPSLKAEVPPVLRLSPLLRELQSRAPVCKVRTPAGDEGW 

FT LVTRHTELKQLLHDDRLARAHADPANAPRYVHNPFLDLLVVDDFDIjARTLHAEMRSLFT 

FT PQFSARRVMDLTPRVEALAEGVLAHFVAQGPPADLHNDFSLPFSLSVLCALIGVPA^ 

FT GKLIAALTKLGELDDPARVQEGQDELFGLLSGLARRKRITPEDDVISRLCLKVPSDERI 

FT GPIASGLLFAGIiDSVASHIDLGTVLFIQHPDQLAAALADEKLMRGAVEEILRSAKAGGS 

FT VLPRYATADVPIGDVTIRAGDLVLLDFTLVNFDRTVFDEPELFDIRRAPNPHLTFGHGM 

FT WHCIGAPLARVNLRTAYTLLFTRLPGLRLVRPVEELRVLSGQLSAGLTELPVTW"

FT CDS complement(58378..58572)

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W9"

FT /note="ferredoxin"

FT /transl_table=11

FT /gene="nysM"

FT /function="participates in electron transfer in P450

FT monooxygenase systems"

FT /product="NysM"

FT /protein_id="AAF71770.1"

FT /translation="MRITVDPGRCVGAGQCVLTAPDLFDQDDDGLVTVLAGAADAADPG

FT DVRDAAALCPSGAISVAAD"

FT CDS complement(58637..59833)

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W8"

FT /note="P450 monooxygenase"

FT /transl_table=11

FT /gene="nysN"

FT /function="presumably involved in modification of the

FT nystatin macrolactone ring"

FT /product="NysN"

FT /protein_id="AAF71771.1" 

FT /translation="MSTEADARTAAPQCPVAFPLRRPGRPFPPPEYATYRGGAGLVRSE 

FT LPSGPVWLVTRHEDVRAVLTDPRISADPSKPGFPKAGRTGGAPSQYEVPGWFVAMDPPE 

FT HGRFRKTLIPEFTVRKVRELRPVIQQIVDERIDAMLAAGTSADLVESFALPVPSLVISS 

FT LLGVPKVDRDFFEDRTRVLVRLSSTDEERDKATQALLRYLGRLIQIKQRRPGDDLISRL 

FT lAAGTLSRQELSGVAMLLLIAGHETTANNIGLGWQLLTNPRWIGDDRIVEELLRYYSV 

FT ADLVAFRVAVEDVEIGGQLIRAGEGIVPLIAAANHDATAFAAPSEFDPERSARSHVAFG 

FT YGVHQCLGQNLVREEMDIAYRTLFARIPSLTIiAVPVEELPLKYDGVLFGLHELPVTWK" 

FT CDS complement(59830..60888)

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W7"

FT /note="putative aminotransferase"

FT /transl_table=11

FT /gene="nysDII"

FT /function="presumably involved in mycosamine biosynthesis"

FT /product="NysDII"

FT /protein_id="AAF71772.1"

FT /translation="MSFTYPVSMPWLQGRELDYVTEAVGGGWISSQGPYVRRFEEAFAA

FT YNDVPFGVACSSGTTALTLiALRALGVGPGDEVIVPEFTMIASAWAVTYTGATPVFVDCG 

FT DDLNIDVSRIEEKITPRTKVIMPVHIYGRQCDMDAVLNLAYEYNLRWEDSAEAHGVRP 

FT . RGDIACFSLFANKIISAGEGGVCLTHDPHLAEQMAHLRAMAFTKDHSFLHKIOjAYNFRM 

FT TNMQAAVALAQTEQLDTILALRRD lEKRYDEALRD I PG ITLMPPRDVLWMYDLRAERRD 

FT ELCAYLAGEGIETRVFFKPMSRQPGYFSADWPALNAARLSADGFYLPTHTGLTAQEQEF 

FT ITGRIRAFYGVA" 

FT CDS complement(60909..62429)

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W6"

FT /note="putative glycosyltransferase"

FT /transl_table=11

FT /gene="nysDI"

FT /function="presumably responsible for attachment of the

FT dGDP-mycosamine to the nystatin macrolactone ring"

FT /product="NysDI"

FT /protein_id="AAF71773.1"

FT /translation="MTLPSGNTRLGWRRRRMHSPGDRAGRVRGARARRPATFRGVLSMG

FT ANRRPILFVSYAESGLLNPLLVIiAGELSRRDVADLWFATDEKARDEVAAWDGSPVRFA 

FT SLGDTVSQMSAVTWDDATYAEVTQRSRFKAHAAVIRHSFAPESRMAKYRRLEEIVEEVE 

FT PALMVIESMCQFGYELAITKGIPFVLGVPFVPSNVLTSHVPFAKSYTPSGFPVPHSGLP 

FT AAMSLAQRIENQLFRLRTLGMFLTSDVRKWEEDNRVRTELGIAPQARQMMARIDHAEQ 

FT VLCYSVRELDYPFPMHPKLRLVGTMVPPLPQAPDDDGLSDWLSAQKSWYMGFGTITRL 

FT TREQVASLVEVARRLDGRGHQVLWKLPRGQQELLPPAAELPDNLRIEGWVPSQLDVLAH 

FT PNVKAFFTHAGGNGYHEGLYFGKPLVVRPLWVDCDDQAIRGQDFGVSLTIJDRPETVDTE 

FT DVLDKITRVLDQPSFTERAEHFAGLLRDAGGRAAAADLLLGLPALATD"

FT CDS 62659..66759

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W5"

FT /note="polyketide synthase; contains KS domain with

FT Cys->Ser substitution in the active site"

FT /transl_table=11

FT /gene="nysA"

FT /function="presumed loading module of the nystatin PKS

FT complex"

FT /product="NysA"

FT /protein_id="AAF71774.1"

FT /translation="MTIGADEDPVVWGMACRYPGGVAGPEDLWELVRTGRDATTAFPD 

FT DRGWDLAALAGDGPGRSATREGGFLTGAADFDAAFFGMS PREAVSTDPQQRLVLETAWE 

FT ALERAGIDPHSLRGSRTGVFVGASGQDYAAVTHASPDDLDGHALTGIiAPGVASGRLAYV 

FT LGLEGPAVTVDTTSSSSLVALHWAVRALRAGECSTALAGGVTVMSTPAAFVGHTRQGGL 

FT APDGRCKPFSDDADGTAWAEGVGIWLEHLSTARAAGNPVLAVLRGSAVNQDGASDGLT 

FT APSGPAQERVIRAALADARLAPADIDLVEAHGTGTRLGDPVEARALLAAYGQDRDPDRP

FT LRLGSLKSTLGHAQAAAGIGGVIKTVLTLRHGLMPRIRHLATPTRQVDWSQGAVAPLTD 

FT HTPWPPADRPRRAGVSSFGISGTNAHVILEEAPPADVPVTRPGTLRPSTVPWPVSAATP 

FT EALDAQLARLRAHLRTHSDLDPLDVGYSLATGRAALRHRAVLLPPADGTAADAVEHARG

FT AAHQRRTAVLFSGQGSQRPGMGRELAARFPVFADALDDALRALDRHLDGPVREVMWGTD 

FT AALLDRTGWTQPALFAVEVALHRLVASLGVTPDFVGGHSVGEIAAAHVAGVLSLEDACR 

FT LVAARATLMQALPAGGAMAALEATEDEVAPLLGAHIALAAVNGPTAVVVAGAEDAVRQL 

FT TARFADRGRRTSRLAVSHAFHSPLMEPMLDAFRDWSRLTFHQPSIPLVSNLTGELAGS 

FT EITSAEYWVRHVRDTVRFADGITALAKAGADVLIELGPGGVLSAMAWDTLGPDSTTDVV 

FT PALSKGRPEETAFAGALGRLHTLGVPVDWPAFYAGTGARRVELPTYAFQHVRHWPTPPR 

FT PNGAGPGALGHPLLGSAVELADGGGTVCSGALSLRTHPWLADHTVAGRWLPATALLEL 

FT AVRAGDEAGCDVLHELHLTTPPALPDDAALHVQVHVGPADTTGRRAVTVHTRPDHHPAG 

FT DWTRCATGTLGSTPPSAAEAATGGTPAAWPPADAEPLDLADHYERLADRGFDYGPTFRG 

FT LRAAWRRGAE I FADVECPPGTADDAPDHGLHPALLDAARHAAMAVDGTVPVAWHGVRLH 

FT AVGATALRVRIRPTTTGTLTLTAVDVHGAPWTVEALTARPLTDEERAAPRTPRQARGE

FT TPADARPARPAAARPGPAGEPLPDTTGSHPTAGHIiAALPPAARERQLLDLVRTQAAAVL 

FT GHPGPEAVGTRSVFKELGFDSLAGVELADRLTARTGLRLPATLVFNFPTPERAAHRLGE 

FT LLAATAPLDPGAYGEELTRFEAIVTNLPQDGPERRAVADRLDAIVSALRQNSPAEVPSS 

FT DEDIDTVSVDRLLDIIDEEFETT"

FT CDS 66805..76383

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W4"

FT /note="polyketide synthase"

FT /transl_table=11

FT /gene="nysB"

FT /function="responsible for condensation steps 1 and 2 in

FT the nystatin polyketide backbone synthesis"

FT /product="NysB"

FT /protein_id="AAF71775.1"

FT /translation="MQEPQQGQPDQQEKIVDYLRRVTSDLRRARRRIGELESKDNEPIA

FT IVGMGCRLPGGVNSPESLWDLVRSGGDAISGFPVDRGWDLETLTGNGDGSSATHEGGFL 

FT YDAAEFDAAFFGISPREATAMDPQQRLLLEVAWEALERAGIAPTALRGSRSGVFVGSYH 

FT WGAPSADAATELHGHALTGTAASVLSGRLAYTLGLEGPAVTVDTACSSSLVALHLAAQS 

FT LRVGESSLAVIGGVTILTEPSVFVEFSAQGGLAPDGRCKAFSDAADGTGWAEGVGVLVA 

FT ERLSDAQRNGHPVLAVLRGSAVNQDGASNGLTAPNGPSQERVIQQALARTGLTPADIDA

FT VEAHGTGTRLGDPIEAQALLATYGQGHTPDQPLWLGSLKSNIGHTQAAAGVAGVIKMVM 

FT ALRHGHLPPTLHADAPSSHVDWSAGSVRLLTEGQQWPETGRPRRAAVSSFGISGTNAHA 

FT LLEQAPHPADTADAGDDAAPTEPAGAPAALPWIVSGHSPQALRDQAAALAARVETDPAL 

FT RPQD IGHT LHTARALLERRAVWAPDRAELLAATHELAAGRS ANAVVEGLADVEGRTVF 

FT VFPGQGSQWGMGAQLLDESAVFAERIAECITUUUAEFTDWSLVDVLRGVVGAPSLERVD 

FT WQPASFAVMVSLAALWGSRGVLPDAWGHSQGEIAAAWSGALSLRDGARWALRSQA

FT IGRALAGRGGMMSVALSVDVLEPRLVEFEGRVSVAAVNGPRSVVVAGEPEALDALHARL 

FT TADDIRARRIAVDYASHSHQVEDLHEELLEVLAELAPRTSEVPFFSTVTGDWLDTARMD 

FT AGYWFRNLRGRVRFADAVADLLAAEYRAFVEVSSHPVLTMAVLDLIEEAGVTAVATGTL 

FT RRDQGGAGRFLLSAAEVFVRGVDVDWAGAFEGTGAARVDLPTYAFQRERYWNTRTAADR 

FT TPADAPMDAEFWAAVEQADVSALTAALGTDEDSVAAILPGLTSWRRARSQRTTLDSWRY 

FT RVTWTPLAQVPRATLTGTWLLVTTDGIDDTDVAGALESYGAEVRRLVLDEECTDRAVLR 

FT ERIAGAEDVTGIVSVLAAAEDDAARHPGLTRGLALTVSLVQALGDAEATAPLWFLTRGA

FT FATGPSDPVTRPLQSQIAGVGWTTALEHPQRWGGTVDLPDTLDARAAQRLAAALSGALG

FT AEDQLAVRAAGVLARRIVRAGHRAGRPARTWAPRGTTLITGGSGTLAPQI^WI^ 

FT EHVVLVSRRGADAPGAPELIAEAAESGTEVTVAACDITDRDAVAALLADLTADGRTLRT

FT VIHAAAAIELSALADTTVAEFADVVHAKVTGARILDELLDDAELDDFVLYSSTAGMWGS 

FT GVHAAYVAGNAYLS ALAEQRRARGLRTTS IHWGKWPDDRARELADPHRIRRSGLEYLDP 

FT ELALTALQHVLDDDETVIGLMDIDWDTYHDVFTAGRPAHLFDQIPEVRRRLDQASVPDP 

FT AGPAADGLAARLHGLAAAEQDRLLLTLVRTEAAAVLGHASAESFPERRAFRDLGFDSVT 

FT AVDLRNRLVAGTGLRLPSTMVFDHPNCAALAAFLKTTALGVPGAAPQQHAATGTPADDD 

FT PIAVIGMSCRYPGGAATPEELLRLALDGADVISEFPADRGWDARGLYDPDPDRPGHTYS 

FT VQGGFLHEAAGFDPGFFGISPREAVAMDPQQRLLLETSWEAFERAGIDPASLRGSAAGT 

FT FFGASYQDYSSTVQNGTGESEAHMVTGTAASVLSGRVSYLLGLEGPAVTVDTACSSSLV 

FT ALHLACQSLRDGESSLALAGGAAVMATPHAFVGFSRQRALAKDGRCKPFSDTADGMTLA 

FT EGVGVVLLERLSHARANGHRVLAVIRGSAVNQDGASNGLTAPNGPSQQRVIRQALANAG 

FT LTGADVDAVEAHGTGTKLGDPIEAQALL.ATYGQDRDAERPLLLGSVKSNIGHTQAAAGV 

FT AGVIKMVLAMDAGELPGTLHLDAPSSHVDWTAGAVELLRGRTPWPESGRPRRAGVSSFG 

FT ISGTNAHLILEQAPATEPPADPDRLRDTATDTWPWPLAAKSPAALRAQAARLLATVEH 

FT DPDLPPAPVGHALATTRAALEHRAVWGERREDFLRGLAALSTGASTAGLVSGIAGPDP 

FT EGAVFVFPGQGSQWWGMGRELLATSEVFRTAIDDCATALAPYVDWSLHDVLAGEGDPAL 

FT LERVDWQPALFAMMVGLSALWRSHGWPAAWGHSQGE I AAACVAGALSLADAARWA 

FT LRSQALPQLSGRGGMMSVSAPVERVTALLAPWQEALSVAAVNGPSSVWSGDTDALDAL 

FT HTACQEQGVRARKVSVDYASHGRHVEAVRDEI^VLAPVDPRAPEVPFYSTVTGD 

FT AAFDGAYWYTNLRQTVRMEE ATRALLAAGHRVFI EVS PHPVLAAP IQETQEAVAEATGG 

FT SAWLGSLRRDEGGPRRFLTSLAEAHTHGAPVDWTTTFARSAYQPVDLPTYPFQRQDFW 

FT PEARPATPAAGADASDAAFWQLVENQDLAALADALGVPADDEHTALGTVLPALSAWRAK 

FT AQARTRIDELRYHVQWTRVAEPAAAPTTGRLLVAVPPDHADAPWVAAALDALGTDTVRF 

FT EAKGTDRAGWAAQIAQLVEDGEEFTGWSLLAAAEDLHPDFGSVPLGLGQTLVLVQALG 

FT DAGLTAPLWCLTRGAVATGRDDALDSPTQGALWGLGRWALEHPDRWGGLIDLPATLDA

FT RAAARLTGLLADPAGEDQLAVRATGVLARRMVHAAPSAPRTGRRWRGRGTCLITGGTGG

FT IGGRVARWMAEHGAAHLVLTSRRGPDAPGAAALRAELEALGARVTLAACDVADRDAIAA 

FT LLADLPADQPLTSVFHSAGVADGDARAADLTLDQLDALLRAKLTAAHHLHELTAPLDLD 

FT AFVLFS SGAAVWGSGGQPGYAAANAYLDALAAHRRSLDLPGAS VAWGTWGEVGMATVPE 

FT VHERLHRQGVRAMEPDHAIGALQQMLEDDDTTLAVTLMDWEAFAPSFTATRPSALFSTV 

FT PEAVRAVTGDPGTTAGDDVDSATPPLRRHLEELSAAERGRALVEAVRAEASATLGHDTP 

FT DAI PAGRAFRDVGFDS VTAVELRNRLRTALGLPLPAALVFDHPTPTALAGHLGALLFGT 

FT APEDAGTGRPDDPDARIREALATVPIGRLRKAGLLDMVLKLADGDATDAPAPEADAPSE 

FT SLDDMDAEALLRLATENSAN" 

FT CDS 76403..109693

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W3"

FT /note="polyketide synthase"

FT /transl_table=11

FT /gene="nysC"

FT /function="responsible for condensation steps 3 to 8 in the

FT nystatin polyketide backbone synthesis"

FT /product="NysC"

FT /protein_id="AAF71776.1"

FT /translation="MSTNPDKYVEALRSSLKEIERLRRQNEQLVAAAVEPVAWGIGCR 

FT FPGGVTSPEDLWELVAEGRDVIGPFPQDRGWDLEKLAGGGEGGSLAQVGGFVEDAAGFD 

FT PGFFGISPREAVAMDPQQRILLEITWEALERAGIDPSTLRGTPTGVFVGTTGQDYGEVI 

FT KASAEDVEVYSTTGHAASVISGRLSYTLGAEGPAVTVDTGCSSSLVALHWAVQALRGGE

FT CSMALAGGAS IMATPGPFVAFTAQSGLAADGRCKPFSDRADGTGWGEGAGMLVLMRLSD 

FT AQREGRPVLAVLRGSAINQDGASNGLTAPNGPSQQRVIRAALDSAHLTAADIDAVEAHG 

FT TGTTLGDPIEAQALLATYGQDRPRPLWLGSVKSNIGHTQAASGAAGVIKMIMALQRGVL 

FT PRSLHATEPTTDVDWTAGS VDLLDETVAWPETGRARRAGVS S FG I SGTNAHVI LEQAPT 

FT APEEPTTEPTVRPAWPWALSARTAAALDAQRARLTGHLADTPDADPLDVGYALiADGRA 

FT TFEHRAVLLPDGTELAHGTAGEGPCAVLFSGQGSQRPGMGRELHARFPVFAAAFDEITA 

FT LLDTHLDRPLREWWGTDADLLNDTGWAQPALFAVEVALYRLVASLGVTPDFVGGHS IG 

FT ELAAAHVAGVLSLEDACTLVAARARLMQALPRGGAMLAIRATEDEVTPHLTDDVS lAAV 

FT NGPTSVWAGTEEAVAAIGARFTAQDRKTTRI.RVSHAFHSPLMDPMIiAEFRAVAAGLTY 

FT HEPRIPVLSNLTGTVAAVADLCSADYWVRHVREAVRFADGVTALTDRGVTTLVELGPDG

FT VLSAMAQESLPDGAAAVPLLRKDRPEELSAVTGLARAHVRGVTVRWAGLFDGTGARRAD

FT LPTYPFQHQRFWPTAARAAQDVTAAGLGAADHPLLGATVELADGAGYLFTSRLSVRTHP 

FT WLADHGVQGRALLPGTAFVELAVRAGDEAGCDRVEELTLAAPLVLPERGGVQLQVRVGA

FT PDAAGRRTLGIFSRVEDGFDLPWSQHATGVLTAGAGAPDPTFDATVWPPSGAEPVDLTG 

FT AYERLAALGFQYGPAFQGLRAAWRRDTEVYAEVALPDGADTDPAAFGLHPALLDAAQHA 

FT AAYADLGAISRGGLPFAWEGVSLAAAGATTVRARIAPAGEDTVTIAVYDAAGGTVLSVD 

FT SLVSREVPADAPGAAGTVHRDSIiFHVEWTPLQGRPGPAPATVAVLGPDPDALADTLRAT 

FT GIRTTAPRDLAALADAEGPVPDLVVTTLTTTPGAPVPDAAHATTAAVLALAQQWI^ 

FT FADARLVLVTRGATDGTDPAAAAAGGLIRTARTENPGRFALLDLAPDTGRPDPETLATA 

FT LAASHDEPDIJVWGTDVHAARLARVPLATEPTTWNPDGTVLITGGTGGLGAVLARHLVA 

FT THGVRHLLLASRRGPAADGADDLTAELTGLGATVHIAACDVADPAALADLLGTVPAGHP 

FT LTVVVHTAGVVDDGVLGSLTPQRLDTVLRPKADAAWHLHEATRHLDLDAFVLFSSVAAT 

FT LGSPGQANYAAGNAFLDALAARRAATGLPATSLAWGPWTQSVGMTSSLSDLDVERIARS 

FT GMPPLTLEQGTALFDAALAAGPAALAPVRLDLPVLRTQGDIAPLLRGLIRTPVRRTAAQ 

FT VSQTADGLAQRLAGLDAAARREALLELVRTQIAQVLGHADATEVETGRQFQDLGFDSLT 

FT AVELRNALNTATGLRLPATMVFDYPTPHALADHLRDELLGTEAESTTAVPVPTRTAGTD 

FT DPIVIVGMACRYPGGIASPEDLWRLVSQGADATGPFPTNRGWDLDNLYDPDPDRPGRTH 

FT VRAGGFLHDAGSFDADFFGMSPREAMATDSQQRLLLELSWEAVERAGIDPASLRDSGTG 

FT VFAGVMYNDYGTTLTGDEYEAFRGNGSAPSVASGRVSYTLGLEGPAVTVDTACSSSLVA 

FT LHWAAQALRAGECSLALAGGVTVMSTPSTFVEFSRQRGLAPDGRSKAFAEAADGVAWSE 

FT GVGMLVLERQSDAVRNGHE I LAWRGSAVNQDGASNGLTAPNGPSQQRVIRQALASGGL 

FT STADVDAVEAHGTGTTLGDPIEAQALLATYGRDRDPENPLLLGSIKSNIGHTQAAAGVA 

FT GVIKMVMAMRHGVLPQTLHVDAPSSHVDWSVGAVELLTEQTAWPETGRARRAGVSSFGI 

FT SGTNAHWIEQSPTAVPATPASADRSVEEPPAVPWALSGKTPDALRDQAARLLAHVEAH 

FT PALRPVDISYSLIATRTAFDHRAWLGTDRAEALRALTALAAGETDPAALTGTVRTGRT 

FT AFLFSGQGSQRLGMGRVLYERFPAFAEALDTVLTALDAELGHPLRDIIWGEDAQLVDRT 

FT GYTQPALFAIEVALFRLLEAWGITPDFVAGHSIGEIAAAHVAGVLSLGDACRLWARAV

FT LMQSLPEGGAMIAVQATEDEVLPLLTDDVSIAAVNSPTSVWSGYENATLAVARHFADQ 

FT GRRTTRLRVSHAFHSPLMAPMLDDFRAWESLTFTAPTTPWSNLTGELAPAEALCSAD 

FT YWVRHVREAVRFADGIRTLADRGVTTFVELGPDSVLSAMAQESAPEGAGTIPLLRRDRP 

FT EEQAVLAALCHLQVLGVEADWSATFRGLDPVRVDLPTYAFQHRWFWPAARPARPDDVRA 

FT AGLGAAEHPLLGAAVQLPDDDGALFTGRLSLRTHPWLADHTVLGTVLLPGTALVELAVR 

FT AGDETGSGHLEELTLAAPLTLPEDGATLLQVRVGSADDTGRRTVTVHARPDDTADRTWT 

FT LHATGVLATTPPAAAAFDTTVWPPADAEPLTTDDCYAHFTTHRFAYGPAFQGLRAAWRA 

FT GDVLYAEVALPESATDEAAAFGLHPALLDAGLHAALLADDRDTGLPFSWEGVTLHASGA 

FT TALRVRLAPNGPNGLSVTAADPAGNPVATVTRLLARPLDAEQLTIHSALTRDALFHLDW 

FT TPVPLPDTANSAPPALLGPDTAVLADALGDPAVARHATLDDLLAGDTTPPATVLVPLGA 

FT PLDGDTAQHAHALTRSALTLVQQWLATDRLADSRLVFVTHGAVATDDAPPTDLAAAAVW 

FT GLIRSAQTENPGTFTLLDLDTEPDSTTALSRALTLDEPQLLLRAGRARAARLTRTPAPT 

FT TTTHTPWSADGTVLVTGGTGGLGGLVARHLVRSCGVRHLLLTSRSGVGAAGAAGLVAEL 

FT ESLGARWVAACDVGDGSAVAELVAGVSESYPLSAWHAAGVLDDGWGSLTPERLAAV 

FT LRPKVDGAWNLHEATRGLDLDAFWFSSVAGVFGGAGQANYAAGNAFLDALMVHRVAGG

FT LPGVSLAWGAWDQGVGMTAGLTERDVRRAAESGMPLLTVDQGVALFDAALATGSAALVP 

FT VRLDLAALRTRGDIAPLLRGLVRAPLRRTAATGLATGADTGLVQRLGRLDHAQRHEALIi 

FT DMVRSSAALVLGHADGNAIDAERAFRDLGFDSLTAVELRNRLRTATGLHLSATMVFDHP 

FT TLSALAEHLRDELFGAVESEVRVPVQALPPTADDPIVWGMACRFPGGVTSPEDLWRLV 

FT DDGTDAITTFPTNRGWDLDNLYDPDPEHFGTSYTRSGGFLHEAGEFDPAFFGMSPREAL 

FT ATDSQQRLLLESSWEAIERAGIDPLTLRGSATGVFAGVMYSDYGSILGGKEFEGFQGQG 

FT SAGSVASGRVSYALGFEGPAVTNnDTACSSSLVALHWAAQALRAGECSLALAGGVTVMST 

FT PSTFVEFSRQRGLAPDGRSKAFAEAADGVGWSEGVGILVLERQSDAVRNGHEILAVIRG 

FT SAVNQDGASNGLTAPNGPSQQRVIRQALASGGLSTADVDAVEAHGTGTTLGDPIEAQAL 

FT LATYGRDRDPENPLWLGSLKSNIGHTQAAAGVAGVIKIWMAMRHGVLPQTLHVDAPSSH 

FT VDWSVGAVELLTEQTAWPETGRVRRAGVSSFGISGTNAHVIVEQPALVESPAAEPSGRE 

FT PGWPLPLSGKSPEALRDQAARLLAGLAERPALRPLDLGYSIiATTRSAFDHRAWLATD 

FT RADAVRALTALAAADADLSAWGDTRTGRHAVLFSGQGSQRLGMGRELYERFPVFAEAL 

FT DVAIDHLDAALPAQASLREVMWGDDVELLDETGWTQPALFAVEVALFRLVESWGVRPDF 

FT VAGHS IGEIAAAHWGVFSLEDACRLVAARATLMQALPTGGAMIAIQAAEDEVTQHLTD 

FT DVSIAAVNGPTSVWSGAESAARTVADRLAENGRKTTRLRVSHAFHSPLMDPMLAEFRA 

FT VAEGLSYATPTLPWSNLTGRLATADDLCSAEYWARHVREAVRFADGVSTLENEGVTTF 

FT LELGPDGVLSAMAQQSLTGDAATVPALRKDRDEETSALTALAHLHTAGLRVDWAAFFAG

FT SGATRVDLPTYAFQHATYWPTGTLPTAHAAAVGLTAAEHPLLNGSVELAEGEGVLFTGR

FT LSLQSHPWLADHAVMGQVLLPGTALLELAFRAGDEAGCDRVEELTLAAPLVLPERGAVQ 

FT TQVRVGVADDTGRRTVTVHSRPEHATDVSWTQHATGTLTMGSAPADTGFDATAWPPADA

FT EPLATDDCYARFTTLGFAYGPVFQGLRAAWRAGDVLYAEVAIiAESTGDEATAFGLHPAL 

FT LDAALHASLVAHEGEESNGGLPFSWEGATLYATGATALRVRLTPTGTDGRSVAIAVADT 

FT AGRPVAAIDNLVSRRVSGDQLTGAAGLARDALFTLDWNPVPENLVPENPVPENTGGGHA 

FT QDQDGRPAAATVALVGADGTAIAADLTAAGIHTTLHPDLTTLATTDADVPKTVLIPLTG 

FT TGTGTGTGTESTDGIGTGAAESDASAPSPAEVAHTLSTAALALVQEWTAQERFAGSRLA 

FT FVTTGATAAGGTDVMDVAAAAVWGLVRSAQSEAPDTFVLIDRDPGPAGTHDRTAAAERG 

FT QLLLRALHTDEPQLALRDGGVLAARLARFDTAAALTPPADRAWRLDSTAKGSLNGI^ 

FT PYPAALAPLTGHEVRVEVRAAGLNFRDVLNALGMYPGDDVGSFGSEAAGVWEVGPEVT 

FT GLAPGDQVMGMITGSFGSLAVDDARRLARLPEDWSWETGASVPLVFLTAYYALKELGGL 

FT RAGEKVLVHAGAGGVGMAAIQIARHVGAEVFATASEGKWDVLRSLGVADDHIASSRTLD 

FT FEAAFAEVAGDRGLDWLNSLAGDFVDASMRLLGDGGRFLEMGKTDIRAADSVPDGLSY 

FT QSFDLAWWPETIGTMIJ^LMDLFRTGALRPLPVRTWDVRHAKDAFRFMSMAKHIGKIV 

FT LTLPRSWKPEGTVLVTGGTGGLGGLVARHLVRSCGVRHLLLTSRSGVGAAGAAGLVAEL 

FT ESLGARVWAACDVGDGSAVAELVAGVSESYPLSAWHAAGVLDDGWGSLTPERLAAV 

FT LRPKVDGAWNLHEATRGLDLDAFWFS S VAGVFGGAGQANYAAGNAFLDALMVHRVAGG 

FT LPGVSLAWGAWDQGVGMTAGLTERDVRRAAESGMPLLTVDQGVALFDAALATGSAALVP 

FT VRLDLAALRTRGDIAPLLRGLVKAPIRRAAATTPGDTGLAEQLTRLQRAERRDTLLALV 

FT RDQAAMVLGHTSGDGVDPSRAFRDLGFDSLTAVELRNRIGAATGLRLPATAVFDYPTAD 

FT ALAAHLLTELLGPDAESDPDEPGDPTAGPTDDPIVIIGMSCRFPGDIGSPEDLWRLLGD 

FT GADWTDFPTNRGWDLDNLYDPDPAHAGTSYARTGGFLHDAADFDADFFGMSPREAMAT 

FT DSQQRLLLESSWEAIERAGIDPLTLRDSRTGVFAGVMYSGYGTRLDGAEFEGFQGQGSA 

FT LSVASGRVSYTFGFEGPAMTVDTACSSSLVALHLAAQALRGGECTLALAGGVTVMS IPD 

FT TFIEFSRQRGLAPDGRSKPFSESADGVGWSEGVGMLLLERQSDAVRNGHQILAWRGSA 

FT VNQDGASNGLTAPNGPSQQRVIRQALASGGLSTADVDAVEAHGTGTTLGDPIEAQALLA

FT TYGRDRDPENPLLLGSIKSNLGHTQAAAGVAGVIKMVMAMRHGVLPRSLNITEPSSHVD 

FT WSAGAVELLTEQTAWPETGRARRAGISSFGISGTNAHVILEQPEAARHSAPEEADTAEA 

FT AAKAPATAHLPVMPWALSGKTPEALRAQAARLLAHLQQRPELAPADIALSLATQRSQFT 

FT HRAVVLSTDRDEATRALSALATTAASDPSALTGTVTMGRCAVLFSGQGSQRLGMGRELY 

FT ERFPVFAEALDWIDHLDAALPAQAGLREVMWGDDVELLNETGWTQPALFAIEVALFRL 

FT VESWGVRPDFVAGHS IGEI AAAHWGVFSLEDACRLVAARATLMQALPAGGAMIAVQAT 

FT EDEVI PHLTDEVAI AAVNGPTS WI SGAEEATQTVAQHFADQGRRTTALRVSHAFHS PL 

FT ^WLAEFRAVAEGLSYATPTLPWSNLTGQVATADELCSAEYWVRHVREAVRFADGVTAL 

FT EAEGVRTFLELGPDGVLAAMARETVADDTVTVPVLRRNMPEERTLLTALGRLHTTGTP I 

FT DWAALLAPTGARPVDLPTYAFQHRPFWPSGPRDTADAAAVGIAGASHPLLNGIVELADE 

FT EGLLFTGRLSLQSHPWLADHAVMGQVLLPGTALLELALRAGDEVGCDHVEELTLAAPLV 

FT LPERGAVQTQVRVGVADTTGRRTVTIHSRPARATTTDSDTHTGTDTPWTQHATGVLVAG 

FT LPATATVPFDATVWPPAHAEPVDLADFYASRAGEGFGYGPAFQGLRAAWRRDGEVFADV 

FT ALPEAGRTEAEAYGLHPALLDAGLHAAWLVAPDGEPTRTGSVPFSWRGVFLAASGAS SV 

FT RVRLGRDSDGTLSIAI ADTTGAPVAS VQALSMRTVS VTALS ATAGLARDALFRLDWAS A 

FT PEPACQPDDTVTVIPAVAWGTETSELTSELTAALRAAGADVDVRTTLSTDEPAPALIA 

FT LPLVASDQTGTAEAAPVPAAVHDLTRRALALVQTRLQEQHFADTKFVFVTRGATVGRDV 

FT AAAAVWGLVRSAQSENPGCFALVDLDPDGAVGAAALVAALVSGEPQLAVRGDVLRVARL 

FT VRRPLTEVGAGADGTGDGVGGGSGVSFSGEGAVLVTGGTGGLGAVLARHLVAEYGVRDL 

FT LLVSRSGERAVGAGELVAELAGVGARVRWACDVTDRAAVVELVGGHAVSAVVHAAGVL 

FT DDGMVGALTGERLSAVLRPKVDAVWHLHEATRGLDLDAFWFSSLAGVFGSPGQANYAA 

FT ANAFLDAIiMTRRRAEGLPGLSLAWGPWSLTDGTSGMLADAEADRLTRSGVPPLTAEQGL 

FT ALFDAAIiATGDATCVPVRLDLSALRAQGEVPPLLRSLIRGRSRRAAAAESATATGLRER 

FT LVGLNPVERQEVLLDLVRGQVALVLGHADADDVHPARAFRELGFDSLTSVELRNRLNTV 

FT TGLRLPATMVFDYPTVEVLVSYVLDELLGTDAEVATVQPAAVAVADDPIVIVGMACRYP 

FT GGVASPDDLWRLVTDGVDAVSPFPTNRGWDVESLYHPDPDHLGTSYTRSGGFLHEAGEF 

FT DPGFFGMSPREALATDSQQRLLLESSWEAIERAGIDPVSLRGSRTGVFAGVMYSDYSAM 

FT LASPEFEGFQGSGSSPSLASGRVAYTLGLEGPAVTVDTACSSSLVAMHWAMQALRSGEC 

FT GIJUiAGGVTVMSTPAVFVDFARQRGLSPDGRCKAFADAADGVGWSEGVGVLVLERQSDA 

FT VRNGHEILAWRGSAVNQDGASNGLTAPNGPSQQRVIRQALASGGLTAGDVDWEAHGT 

FT GTTLGDP I EAQALLATYGRDREPERPLLLGS VKSNLGHTQAAAGVAGVI KMVLAMRHGV 

FT VPRTLHVDAPSSHVDWSEGAVELLSEQAAWPETGRVRRAGVSSFGISGTNAHVILERPE 

FT AARRPVMETNTVEPSTVPWVLSGKTPEALRAQAAKLLSSIEERPELRLVDVGMSLVTGR

FT STFEHRAWLAADRADAARALSAIAADEADAAAATGRVGAGRHAVLFSGQGAQRLGMGR

FT ELYERFPVFAEALDVWDHLDAALPAQAGLREVMWGDDAELLNETGWTQPALFAIEVAL 

FT FRLVESWGVRPDFVAGHSIGEIAAAHVAGVFSLEDACRLVAARATLMQALPAGGAMIAV

FT QATEDEVTPHLTDDVAIAAINGPNALWSGVEDAAVEIGARFAAEGRRTTRLHVSHAFH 

FT SPLMDPMLAEFRWAEGLSYAAPSLPWSNLTGQVATADELCSAEYWVRHVREAVRFAD 

FT GVTALEAEGVRTFLELGPDGVLAAMAGASLTESSLAVPLLRKDRPEEPAALAALAQLHI 

FT AGARVDWPVLFAGVGAGRVELPTYAFQRGWFWPVGRVGVGGDVGAVGLGSAGHPLLGAA 

FT VELAAGAGWLTGRLSLSSHGWLADHAVMGRVFVPGTALLEMVMRAGDEVGCGRVEELT 

FT LAAPLVLPERGGVRVQVAVDAPDAAGRRGVGVYSCPDGVGQAVWSQHAVGVLASGVADQ 

FT VGGFGDGGVWPPQGAVSVDAEGCYELFADAGFGYGPVFQGLRAVWRRGEELFAEVALSD 

FT EVAESADTATGFGLHPALLDASLHASLLSSLEGQSADGGPALPFAWEGVSLFASGATAL 

FT RVRLAPAGEHAVSVTAVDPTGAPVISIDALRTRRLTLDEVNASHTQLSDALFGVQWTTV 

FT PSTPAADHPSVAIIGTDHLGLAEALSSSSAGATTTTTAAAYESLDALIAAGPEVSVPDV 

FT TLIGLTTEDAIAQYVNDHDATVAGQGTIGAGAAAVDAARRLTAEALRTIQAWLADERLA 

FT ARRLVFVTRGAADGQDVAAAAVQGLVRSAQTENPGTFGLLDLDGTEASTAVLGEALTSD 

FT EPQLLLRDGHLHAARLTRLASPADTAVPTEWNADGTVLITGGTGGLGAQFARHLVDRYG 

FT VRNLLLVSRRGPDAPGTTELVAELTAHGAEVAVQACDVADGDAVAALVAGVPDEHPLRA 

FT WHTAGVLDDGVIGSLTEERLATVLRPKADAAWHLHEATRGLDLDAFWFSSVAGVFGG 

FT AGQANYAAANAFLDALMAQRRAAGLPGLSIiAWGPWDQTGGMTGMLSDAEADRLARSGIP 

FT PLSAEQGLALFDAAIiALAGTSTPDRAAGSAAASTSGTGDTIAIPAAALVAPVRLDLAAL 

FT AAQGEVPAILRGLVRTRTRRTAAGGSVTVAGLVNRLSGLTADERRQELLELVRTQAALV 

FT LGHADPASVDSTAQFRDLGFDSLTAVELRNRLSTATGLRLTATLVFDYPNTDALAEHLR 

FT DELFGAVESEVRVPVQALPPTADDPIVWGMACRFPGGVTSPEDLWRLVDAGTDAITTF 

FT PTNRGWDLESLYDPDPAHLGTSYTRSGGFLHEAGEFDPAFFGMSPREALATDSQQRLLL 

FT ESSWEAI ERAGIDPLTLRGSATGVFAGVMYSDYGS ILGGKEFEGFQGQGSAGSVASGRV 

FT SYTLGFEGPAVTVDTACSSSLVALHIJ^QALRAGECTIiALAGGVTVMSTPGTFVEFSRQ 

FT RGLAPDGRSKAFAEAADGVGWSEGVGILVLERQSDAVRNGHEILAVIRGSAVNQDGASN

FT GLTAPNGPSQQRVIRQALASGGLSTADVDAVEAHGTGTTLGDPIEAQALLATYGRDRDP

FT ENPLLLGS IKSNLGHTQAAAGVAGVIKMVMAMRHGVLPQTLHVDAPSSHVDWSVGAVEL 

FT LTEQTVWPETGRVRRAGVSSFGISGTNAHVILEQPEAVQRLAPGAAETVEPVAIKPSAE 

FT PSLVPWALSGKSPEALRAQAARLRETFLAERPEPRSIDIGHSLAVTRSQFDHRAIVLVDD 

FT AKAPADSLAALAALASGVADPAWSDAVSTGGSAVLFTGQGAQRLGMGRELYGRFPVFA 

FT EALDVWDHLDAALPAQAGLREVMWGDDVELLNETGWTQPALFAVEVALFRLVERWGVR 

FT PDFVAGHSIGEIAAAHVAGVFSLEDACRLVAARATLMQALPTGGAMIAVQATEDEVTPH 

FT LTDEVAIAAVNGPTSWISGAEEATQTVAQHFADQGRRTTALRVSHAFHSPLMDPMLAE 

FT FRAVAEGLSYATPSLPWSNLTGWLATADELCSAEYWVRHVREAVRFADGITTLEAEGV 

FT RTFLELGPDGILSALAQQSLAGEAVTVPVLRKDRGEESTALTARAHLHTRGLIEDWQDF 

FT FAGVGAGRVELPTYAFQRGWFWPVGRVGVGGDVGAVGLGSAGHPLLGAAVELAAGAGW 

FT LTGRLSLSSHGWLADHAVMGRVFAPGTALLEMVMRAGDEVGCGRVEELTLAAPLVLPER 

FT GGVRVQVAVDAPDAAGRRGVGVYSCPDGVGQAVWSQHAVGVLASGAADQVGGFGDGGVW 

FT PPQGAVSVDAEGCYELPADAGFGYGPVFQGLRAVWRRGEELFAEVALSDEVAESADTAT 

FT GFGLHPALLDASLHASLLSSLEGQSADGGPALPFAWEGVSLFASGATALRVRLAPAGEH 

FT AVSVTAVDPTGAPVISIDALRTRRLTLDEVNASHTQLSDALFGVQWTTVPSTPAADHPS 

FT VAI IGTDPFGLADGLSDALPLVEERGDLAALAASEHPVPDLVLVPVAGTRRTGVPADAE 

FT GHTDAGTSDMLRSVREATAQVLEQIQQWLADDRFEAARLVFVTRGAVSVGEGGIADLAA 

FT SAVWGLVRSAQSENPGCFGLLDLDLDLALDSDLAPEVDIERDRDRDPVGGTVQPALAAA 

FT LHATADEPQLALRGGTVQAARLTRIPAPQTDRAETDPAETDRPEIDTRRPGTVLITGGT 

FT GGLGGLLARHLVAERGVRSLVLASRSGLAAEGAEKLVADLEAIiGAWAVQTCDVADGDA 

FT VAALVAGVSDEYPLTAVVHTAGVLDDGVIGSLTEERLATVLRPKADAAWHLHEATRDLD 

FT LDAFWFSSLAGVLGGAGQANYAAANTFLDALMAQRRAAGLPGVSLAWGPVTORAGGMTG 

FT TLS DAEADRLARSGVPP I S AEQGLALYDAATAGERPL WPVRLDLAALRGLGD VP ALLR 

FT GLVRTPARRTAAAGAAPSADVLTRQLAGLGGAEQEEVLLRLVRGQAAWLGHADGSAIG 

FT AGRQFQELGFDSLTAVEFRNRLNAATGLRLPATLLFDYPTPADWGHLRGRIiGTGEVSG 

FT AGSVLAALDNLEAVIAGLSLDDAGEHQLVAGRLEVLRAKWADMRSAEGAVDGGADVDIE 

FT EASDDDMFALLDDELGLN" 

FT CDS 110113..110868

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W2"

FT /note="putative thioesterase"

FT /transl_table=11

FT /gene="nysE"

FT /product="NysE"

FT /protein_id="AAF71777.1"

FT /translation="MTTSTEESLWARCFHPAPAAPVRLFCFPHAGGSASFYFPVSAQLS 

FT SVAEVFAIQYPGRQDRRKEAGVSDIiATLADQVYDALRPLLKERPSTFFGHSMGATLAFE 

FT VARRFEADDGDLVRLFASGRRAPSRVREEAVHRRSDDGIVEELKLLAGTNTALLGDEEI 

FT LRMILPAIRSDYQAIETYRCPPDVTVRAPLTVLTGDRDPKTSLDEAEAWRGHTTGDFDL 

FT KVLPGGHFFVSSEAPAIIDLLRAHLAGNG" 

FT CDS 111258..114158

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W1"

FT /note="transcriptional regulator"

FT /transl_table=11

FT /gene="nysRI"

FT /function="transcriptional activator for the nystatin

FT biosynthesis genes"

FT /product="NysRI"

FT /protein_id="AAF71778.1"

FT /translation="MRKQSGSSGLLTTLVGRDDELRTLARHAAAARDGRAGLVLLHGPA

FT GMGKTSLLRSFTASDVCRGMTVLYGTCGETVAGAGYGGVRELLGGLGLSGGDARRSPLL 

FT EGLAARALPALTADPAGPDAATGAYPVLHGLYWLAARLMAQRPLVLVLDDVHWCDERSL 

FT AWIDFLLRRAEDLPLLWLAWRSEAEPVAPAVLADIAAQRRPTVLGLHPLGPDDIGEMV 

FT RRVFRTTAAPS FVSRVAAVSGGNPIiALARLLDELRAEGVRPDAAGERRAAEVGSHVLAR 

FT SVRCLLERRPPV^GVARAIAVLGPECTELIWVLAGVPAATVDEALLVLRRAGILAADR 

FT VDFVHDWRSAVLDDVAPPTLAELRTNAALLLSDAGRPSEELAGQLMLLPVLDQPWMAA 

FT VLRDAAAQAESRGAPEAGWCLYRVLEVEPDNVAVRIQMARALAEINPPEAMRLLKEAL 

FT SLAGDVRTRAQVAVQYGFTCLAVQESPSGVRMLEDALAELTAELGPEPGPVDRELRTLV

FT ESVLLIVGADEKVTIGAVRDRAARLTMPPGDTPAQRQMLAMTTVLTAMDGRDARSAVDQ 

FT ARRALRAPGVELEPWSLLSASFALSLADEVADAQYALDLMLQYGQDNAAVWTYVLALST

FT RALLHHGVGAFPEALADAQTAVEILGEERWADGAVLPRVALATALVDRGEPERAEHVLD 

FT GITRPRLERFVIEYHWYLQARAYARVAmGDFQGALDLLLACGRSLEESRFSNPAFVPWW 

FT ADGAVLLATLDRHDQARELAAYGSELAERWGTARGLGLAFMAQGVAAPGRAGIDHLTEA 

FT VSLLADSPARAMEARAELLLGHAHLKRDDLRAAREHLRAAADLAQRCGAVKLGVDARKL 

FT LVTAGGRVRRMTASPLDMLTGMERTVADLAVTGASNRAIAEALFVTVRTIETHLTSVYR 

FT KLGVGGRAELSAVLETRTATSGRQPPAWVSQARGRA" 

FT CDS 114182..117043

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4W0"

FT /note="putative transcriptional regulator"

FT /transl_table=11

FT /gene="nysRII"

FT /function="presumable transcriptional activator involved in

FT regulation of nystatin biosynthesis"

FT /product="NysRII"

FT /protein_id="AAF71779.1"

FT /translation="MPRSKARNQPTTCTPQCAPDAHGDPTMLLECGREQRLIGDLLHRL 

FT GQGRPSVLSLTGRPGHAQNALVRWGACRARHDGLRVLRAQATPAERELRYGAVLQLLAV 

FT LDGPHGSTLDAAIRHDGPPPLPVPGIEEVLRRTGTAPTLWVEDVQWLDPASLTWLQIL 

FT LRHLGPDTPI^VLASSCGDTTAFDTDPKAPAVPGPPDTVPVARFVVPALTDRGVAATVR 

FT AVCGTPGDEEFIAALTSATAGNPAILRDALRAFVDHGLPADADHLPELHALTAGWGDH 

FT TVRALDGLPAEVNAVLRAIiAVCGDLLDFHRVRALAGAHSLSEDRIRTLLASVGLTVSVG 

FT DKVHIRFPASKARVIEDMPAAERADLYVRAAELTHSCGVNDEDVAHLLLRSSPLGAPWV

FT VPLLRRGFAAALRREDHHRACACLSRALQEPLDPRERSLLTLELAAAEAVARPEAGDRR 

FT LGELVRSTVADTDPTSSGEGVGVRAIDLGFARGNSEWVRRTAGEALPYAGPADREELVA 

FT LFWIJ^VRDDDAPMIPVVPRLPDRPVPPAQAGARAWQLATAGEDADKARKLARIALTGG 

FT VNESLMMPKLAACAALFATDDNDEAVHGLDTMLTAARSAHLRSMAARI FNLRARIHLCA 

FT ARLEAAERDLDSAERALPPTSWHPRALPNLIATRILVSMETGRPDRARRIjAEAPVPAGG 

FT EEGVWWPALLLARARVAADDGDWEEALRLSRECGRWLFRRHWANPAMLSWRPLAAEACL 

FT KLGDVTEARRLRDEELFFADRWGTASARGIARLTTRRLFDDDGDRAVRRIREAAALLRD 

FT SPARLAYLWSRLSQAGAETAHGDTAAAARSWQAVARMTAAHPASRLATAARTLTVPSVP

FT VATAPPTAWPPGWRDLSEAEKDTVLLAARGHGNRQIAEQLAVSRRTVELRLSNAYRKL

FT RIGGRKELYLLLEALEGPVADAS"

FT CDS 117033..119816

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4V9"

FT /note="transcriptional regulator"

FT /transl_table=11

FT /gene="nysRIII"

FT /function="presumable transcriptional activator involved in

FT regulation of nystatin biosynthesis"

FT /product="NysRIII"

FT /protein_id="AAF71780.1"

FT /translation="MLLERENELARIRAALDAAEAGDSSLLLINGPLGSGRSALLRRIP

FT ELAGDGTRVLRASAAWRERDFPFGIARQLFDHLLSGAGGAGPAERTAGAEHPSRLMDTG 

FT DRPTGTGPALEVSQAVLQGAQALLADASAERRLLILVDDLQWADGPSLRWLAHLTRRLH 

FT GLRALLVCTLADGDHRGRYPLVREVAGAAHTVLRLAPLSRDATRVLLAGPQGRPPQDAL

FT VRAVYEASRGNPLFLTAFRSALRATGRPPGGDHFGAVRELSPTVLRDRLAGHLRIQPQP 

FT VREVAVAVAALGDHSDP VLLAQLAGVDE IGFAGARRALVDAGLLARGRDVRFVHGWRD 

FT AVDSLLTLDERERSHDDAADLLYRCGRPAEQVAGHLLAWHPGRPWSEAVLRSAAHNTUj 

FT RAGRPADAARYLRRAIiLHHRTQDGCRARILVDLATAERALDPDACVRHV.SQAVALLDTS 

FT RDRAAAVLRIPPSLLAAPSPSAVELVRQAAAGLDEPGQRDEEGADELALRLEAWLRHSG 

FT HENPVELASSVARLRRMGARPPVDSVAERELVAVLLSAGALSGRLSAAEIADTGNRILE 

FT REPATAAHAHTPLPLVMLSLFVAESVQGVASWLASEQHTRRRYATGADDVLLTAERAFV 

FT LVTQGRPAAAREHVERALVMDAGDWSEPAVMMFAAVAFELRDPALSERILERIRDRRPA 

FT GLALTATGQMLQAAVDVHFGRGRDALDTLLACGRRLETVGWRNSALLPWRPYAIGLHQR 

FT LGETDAALQLAEDELRWAREWGATTNLGRALRLKGWLLQDEGLDLLRESVEILRASSYA 

FT telartlwlgrrlpggpeaeavlreaagiaaacgvpwlaeraelglgsaivppvatlt : 

FT PSERRVASLVSRGLTNQAIATELGVSSRAVEKHLTSAYRKLGVSGRRELVNALPGR" 

FT CDS 120268..120900

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4V8"

FT /note="putative transcriptional regulator"

FT /transl_table=11

FT /function="presumable transcriptional response regulator"

FT /product="ORF4"

FT /protein_id="AAF71781.1"

FT /translation="MISAQTAPAGESVGPGLMASLDRDLTIKHANQEFRRRFDDSAGDV

FT CGRSFRDLMHPSVQQPLMRQFSRLIEGKRHRFASHWAVGAQDAAFAGTLTASAVTGKT 

FT PDIAGILVLMDSSGAADAADAGVVTSQKKFLTEIDARILEGIAAGLSTIPLASRLYLSR 

FT QGVEYHVTGLLRKLRVPNRAALVSRAYSMGILNVGTWPPKWDDFIK" 

FT CDS 121589..122350

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4V7"

FT /note="putative repressor"

FT /transl_table=11

FT /function="presumable transcriptional repressor (DeoR

FT type)"

FT /product="ORF3"

FT /protein_id="AAF71782.1"

FT /translation="MDAEGRRRDMLELIRRSGSADWRLAEEFAVSKETVRRDLNVLEG

FT HGLIRRRHGGAYPMVRPGSEAVFVSRTAQPIPEESRIATAAAELLSEAETVFIDEGFTP 

FT QLIADALPRDRPLTIVTASLPWSAFATSPQANVLLLGGRVRRGTTATVDHWAVHMLSG 

FT FVIDLAFLGAEGISRRYGLTTPDPAVAEVKAQAIRVARRPVLAGVHTKFGTASFCRFGE

FT VGDLETIVTGAGLPVAEAHRYHLMGPKVLRV" 

FT CDS complement(122404..123468)

FT /codon_start=1

FT /db_xref="SPTREMBL:Q9L4V6"

FT /note="putative transcriptional regulator"

FT /transl_table=11

FT /product="ORF2"

FT /protein_id="AAF71783.1"

FT /translation="MAQDSGQTPRSLDHVDQALVHALQITPRASWTRIGSVLGLDAVTV

FT ARRWNRLVETGAAWISCHPAPVLAASGQGCLAFVEIDCAPGRLLDVARALAAAPHVVAL

FT SHVTGDRDLQLimiARDPAMLSRWVTHDLAALDGVRAARTHLAGPVHTEGSRWRLRAL 

FT RHQVARLAADASRHRTDTPAFVLDELDQQLVTALSVDGRATYRALAEQCGAGPDTVRRR

FT VQRLFAADMLHARCEVARPLSEWPVTVSFWGQVPAARLREVTRRVTGMREVRLCASVIS

FT RHNLHLVAWVRSLDDAQRFEVRLAERAADLTVTERAVALWHMKHGGHLLDEEGYRVGVT

FT PLALWREPTDARRG"
XX 

SQ Sequence 123580 BP; 15426 A; 49056 C; 42187 G; 16911 T; 0 other; 

gtatgaccta tttcgccccg tggcgtaagg agtagccggc aggtttcatc cgaaggtggg 60 

gcgcgggagc ggcgagttga ccgtgaggtg cgtgcggtgt tcccggtcgc gccggggggt 120 

gggggccggc aggtggaccg ccgtggtcag cagggcgggg gtgctgtgcc agcggccggg 180

gaagaccgtg acccggcgac cgcccagcag ggggcccggg accaggagtc gggcggtgaa 240 

ggcgccctgt gtcgggtcga tgacgatctc ggcctcggag aagtccagtt cgcgctgggt 300 

caggggatac cacgccttga agacgctctc cttggcgctg aagagcagcc ggtcccagtg 360 

gacgtccgga cgatgggccg ccagggccac gaggtgcggc cgttccgagg gcagggcgat 420 

ggcgttcagg acgccggccg gcagcggacc gttcggttcg gcgtcgatgc tcaccgcggc 480 

cgacagctcc gcgggggaga cggcggcggc gcggtagccg gcgcagtgcg tcatgctgcc 540 

gacgatgccc ggcggccact gcggggcgcc gcgccgattg ggcagtatgg cccggtccgg 600 

gtggccgagc cggcgcaggg cccggcgggc gagatggcgg acggtggtga actcgcgctg 660 

ccgggactcc acggcccggg cgatgacctc gcgttccgag gagagcagcc ggtcgccggg 720 

gcgcggccgg tcgtcgtacg ccgcctcggt ggcgaccgtg gcgggcagga tcagttcgat 780 

caccgcatac ctccggcgaa cggtaagaac agggggttgc ttcggggcac ggttccgtcc 840 

ttgacggggc gccgtgggcg gccgggtcag ccggccgcgg cgcccgcggt gggagtgtgg 900 

gtgcgggctg cgtgcagccg ggcgtacagg ccctgcgcgc acaggagttg atcgtgggtg 960

ccctgctcga cgatgcggcc ggcgtccatc acgacgatca ggtccgcgtc gcggatcgtg 1020 

gacaggcggt gcgcgatcac gaagctggtc cggcccgccc gcagggagtt catggcgcgt 1080 

tggatcagga cctcggtccg ggtgtccacg gagctggtgg cctcgtccag cacgaggacg 1140 

gccggtctgg cgaggaaggc ccgggccacg gtcagcagct gcttctcgcc ggcgctgacg 1200 

gtgcccgact cgtcgtccag cacggtgtcg tagccctgcg gcagggtgcg gatgaagcgg 1260 

tcggcacagg tcgcgcgggc cgcctcctcg atgtccgcac ggcaggcacc gggtgcgccg 1320 

tacgcgatgt tctccgcgat ggtgccgccg aacagccagg tgtcctggag caccagcccg 1380 

aagcgggacc gcaggtcgtc gcgggtcatc gtcgcggtgt cggtgccgtc caggaggatg 1440

cggccggagt ccggttcgta gaagcgcatc aggaggttgc cgagggtggt tttgccggcg 1500 

ccggtggggc cgacgatcgc cacggtgctg cccggttcca cggtcagcga gaggttctcg 1560 

atgaggggcg tgtcggggga gtagcggaag gacacgtcgg tgaactcgac gcggccctcg 1620 

gcgcgggcgg gcgtgccggg ccggagcggg tccggggcct gctcgggggc gtcgagcagg 1680 

gtgaagacgc gctgggcgga ggcgatgccg gactggagcc ggcccgccac cgaggcgatc 1740

tccacgatcg gctggctgaa ctggcgggcg tagaggatga acgcctgcac gtcaccgagg 1800 

gtcagggtgc cgtttatgac cttccaggcg ccgatgacgg ccaccagcac atagccgagg 1860 

ttggcgacga acatcatgac cggttccatg gcaccggagg cgaactgcgc cttggccgca 1920

gcccggtaga ccgcgtcgtt gcaggcgtcg aagcgctcct cggccgccgc gcgccggtcg 1980 

aagcccttga tcagcgcatg accggtgcac acctcctcca catgggcgtt gagggtgccg 2040 

ttcgcggacc actgcgcggc gtagtggggc tgcgcgcgct tgctgatccg ggccgcgatc 2100 

aacgccgaga ccggcacgct gagcagcatc accacggcca gcgacggcga gatcaccagc 2160 

atcagcacca gcatcgtcaa cagcgagaag atcgaggtga tcagctcggc gagggtctgc 2220 

tggagggtct gttggaggtt gtcgatgtcg ttggtggtgc ggctgagcag ctcaccggcc 2280 

ggctgccggt cgaagtggcg cagcggcagc cgggtcagct tctcccgggc gtcgcggcgc 2340 

agttcgtgga tggtgcgcca caccgcggac gccaccagcc ggccctgcgc cagcatgaac 2400 

agcgacgcca cgacgtagag cgccagcagg accagcagca gccggccgat cgcggcgaag 2460

tcgatcccgg gcgccgggcc cgggacgccg ccgagcacgc cgtcggcgat cagatcggtg 2520 

acccggccga gcagcagcgg gccgaacgcg ttgagcacga tcccgccgac gcccatcgac 2580 

acggcgagtg ccacggagcg gcggtgcgga cgcagcaggc cgacgagccg acgtgccggc 2640 

cggggcgcgg tccgctcctc ctcaaggtcg tccggcgagg ccatgggcgg cttcctcctc 2700 

ggtcagctgc gagagcgcga tctcgcggta ggtcgggctg gtgcgcagca gcacgtcgtg 2760 

ggtgccctgg gcgaccacgc gcccgcggtc cagcaccacg atccggtccg cgtcgcggcc 2820 

ggcggagatc cgctgcgcca cggtgatcac cgtggcgccc gcggtgtacg gcaccagcgc 2880

ggtccgcagc gccgcatcgg tggcctggtc gagcgccgag aaacagtcgt cgaagagata 2940

gatctccggc cggcgcagca acgcccgggc gagggacagg cgttggcgct ggccgccgga 3000 
gacattgccg ccgccctggg tgatctccgc gtcgaggccg tccggcatcc gcgccacgaa 3060

gtcggccgcc tgggcgaccc gcagcgcctc ccacagctcc tcgtcggtgg cgtccggccg 3120

cccgaagcgc agattgctcg ccacggtgcc ggagaacagg tacggccgct gcggcacgaa 3180 

cccgacggcg gcggcgagcg tggccgcggt cagctcgcgg acgtcggtgc cgccgacccg 3240

caccgcgccc tcggtggcgt cggccagccg cagcaccaag ttcaacaggg tcgtcttgcc 3300 

gctgccggtg ctgccgagca cggcgatccg ctcgcccggc tcgacggtca ggtcgacgtc 3360 

ccgcagcacg ggctcctcgg cgcccgggta gcggtacccg gccgcgcaca gttcgatccg 3420

gccggcgggc ccgcgcaccg gctgcggcgc ggccggcggc gccacgctcg acccggtgtc 3480 

caggacctcc gcgatccggc cggcacagac ccgggcccgc ggcaccgaca ggaacacgaa 3540 

ggcgagcatc acgacggaca tcaggatcag cgagagatag ctcaggaggg cgctgagcga 3600 

gccgatcggc atccggcccg cgtcgatccg gtgggagccg gtccacagca gggctacggt 3660 

gaaaccgttc atcagcagca gcacgaccgg cagcatcgtc gcgatcagcc gacccacccg 3720 

ccgcgacacc acgaggaacg cgtcgttggt ctgcgcgaac cgcgcgcgct cgtggtcgtc 3780 

gcggacgaag gaccggacca cccgcacccc ggtgatcgcc tcgcgcagca gccgccccag 3840 

ccggtccagg gtcagctgca tccgcgcgta cagggtgccc atccgggcca gcagcaggcc 3900 

gaagcagacc gccaccacca gcaccagcgc caccagcagc agtgccagcg gaacgtcctg 3960 

gcgcagcgcc agcagcacgc tgcccaggca catcagcggc gcgcagacga cgatgccgaa 4020 

gccggtctgg gcgaggttct gcacctgctg cacgtcgttc accgaccggg tcagcaggga 4080 

gggggtgccg aaccggccga tctcgcgggc ggagaagtcc aggatgcggc ggaagagcgc 4140

ggaccgcaga tcgcggccca tcgccgtcgc ggtccgggcg gccagcgcgg ccgcaccgag 4200 

cgcggcggcg atctgcacca gcgccaccac gcccatcacc acacccagct cggtgatgcg 4260 

cccgccgtcg ccgcgcacca ccccctggtc gataagtgcg gcgcccagcg tcggcagcag 4320 

caaagtgccc aggatctgga cgagttgaag ggcgacgaga gcggcggtgg cccaggcgta 4380

ggggcgcagc tgtgcccgca gaagtctcaa cagcacggag gaacaccccg gttgacggcg 4440 

gcgtgcggcc gcgcggtgga cggagtcggc cggcccgccg acccggtcat tgcaccgcag 4500 

tccgctccga aatttcacta gtgttggggg tggcaacgga cttttgacgg cccggtactc 4560 

gtgaattccc tagaaagccc gggatgcgtt gacagcatct tccgcggctt gcgagcgtgc 4620 

gggtgtctat cgtccggact gcgtattcga cgccaggtcg tggccgagct caagcgtttg 4680 

gacggtctct ttcgtgttcg agaagggttt gccatgtcca aacgagcgct gatcaccgga 4740 

atcaccggcc aggacggctc ctatctcgcg gagcacctgc tgtcccaggg ctaccaggtg 4800 

tggggtctga tccgcggcca ggccaatccc cgcaagtccc gggtcagccg cctcgcctcc 4860 

gaactcgact tcatcgacgg ggacctgatg gaccagggca gcctggtctc cgccgtcgac 4920 

accgtgcagc ccgacgaggt ctacaacctc ggcgccatct cgttcgtgcc gatgtcctgg 4980 

cagcaggccg agctggtcac cgaggtcaac ggcatgggcg tgctgcgcat gctggaagcc 5040 

atccgcatgg tcagcggact gtccacctcc cgcacggtca gcccgcgcgg ccagatccgc 5100 

ttctaccagg cgtccagctc ggagatgttc ggcaaggccg ccgagacgcc gcagcgcgag 5160 

accaccctct tccacccgcg cagcccctac ggcgcggcaa aggcgtacgg gcactacatc 5220 

acccgcaact accgcgagtc cttcggcatg tacgcggtct ccggcatgct cttcaaccac 5280 
gaatccccgc gccgcggcca ggaattcgtc acccgcaaga tcagcctggc ggtcgcccgc 5340

atcaagcaag gcctccagga caagctggca ctcggcaacc tcgacgcggt gcgcgactgg 5400 

ggctatgccg gcgactacgt ccgcgccatg cacctgatgc tccagcagga cgccggcgac 5460 
gactacgtca tcggcaccgg gcagatgcac tcggtgcgcg acgcggttcg gatcgcgttc 5520 
gaacacgtcg gcctgaactg ggaggactac gtcgtcatcg accccgacct ggtgcggccc 5580 
gccgaggtcg aggtgctgtg cgccgacagc gccaaggccc aggaccgcct cggctggaag 5640 
ccggacgtcg acttccccac cctcatgcgc atgatggtcg attccgacct ggcgcaggtt 5700 
tcccgcgaaa accaatacgg cgacgtgctg ctcgccgcca actggtagca gttctcaagc 5760 
tttcgaaaac tagtgaattc ctgccggaat tccgacgaca ctcgccacat ggattccccc 5820 
ggtgagtggc gaatccaggt ggcgaatccg aacgtaccgc cgaaacggcg tggagaagtc 5880 
ggacgccatt cacgtgcggc cgtcggctgt cgagatgggt tgagttgaga tggacaacga 5940 
acaaaaactc cgggattacc tcaagcttgc gacggccgac cttcgacgca cccggcggcg 6000
cgtccacaag ctggagtcgg cggcccagga accggtggcc atcatcggca tgacctgtcg 6060 
ctaccccggc ggcgtccgca gccccgaaga cctctggcgc atggtcgagg ccggcgagca 6120 
cggcgtcacc ccgttcccca ccgaccgcgg ttgggacctg gaggcgctgg ccgccgcgcc 6180 
gaccgcctcc ggcggattcc tgcacgacgc acccgacttc gacgcggact tcttcggcat 6240 
ctcgccgcgc gaggcggtcg ccatggaccc gcaacagcgc gtcgtcctgg aatccgcctg 6300 
ggaggcgttc gaacgcgccg gcatcgaccc gacgtccgtg aagggcagcc gcaccggagt 6360 
cttcatcggc gcgatggccc aggactaccg ggtcggcccc gccgacggcg ccgagggctt 6420 
ccaactcacc ggcaacaccg gcagcgtgct gtccggccgc atctcctaca ccttcggcac 6480

ggtcggcccc gccgtcaccg tcgacaccgc ctgctcctcc tccctcgtcg ccgtccacct 6540 
cgccacccag gcgctgcggg ccggcgagtg caccctcgcc ctcgccggcg gcgtcaccat 6600 
catgtccggc cccggcacct tcatcgaaat gggccgccag ggcgggctct ccgccgacgg 6660

ccgctgccgc tccttcggcg acaccgccga cggcaccggc tgggccgaag gcgtcggcat 6720 

cctcgtcctg gaacggctgt ccgacgccgt ccgcaacggc cacgagatcc tcgccgtcgt 6780 

ccgcggcacc gccgtcaacc aggacggcgc ctccaacggc ctgaccgccc ccaacggccc 6840

ctcccagcag caggtcatcc agcaggccct ggtcaacgcc cgactcgccg ccggggacat 6900 

cgacgtcgtc gaggcgcacg gcaccggcac caccctcggc gaccccgtcg aggcccaggc 6960 

cctgctcgcc acctacgggc agaaccgccc ggcggaccgg ccgctgctgc tgggctcggt 7020 

caagtccaac ctcagccaca cccaggccgc cgccggcgtc gccggcgtga tcaagatggt 7080 

catggcgatg cggcacggca ccctgccgcg caccctgcac gccgaggagc ccacccacca 7140 

cgtcgactgg tcgcagggcg ccgtgcggct gctgaccgac accaccgact ggcccgccac 7200 

cggggcgccg cgccgcgccg ccgtctcctc cttcggcatc agcggcacca acgcccacac 7260 

catcatcgag caggcccccg aaccgcagcc cgaggacgcc gcgaccgcgc aggacgacgc 7320

cgccggcagc acgccggcca ccgcccccgt agtgcccggc gtcgtaccgg tcctgctctc 7380 

cggccgcacc ccggacgccc tgcgcggcca ggccgcggcc ctgcgcgccg ccctcgacac 7440 

cggccggcgg cccgacctgc tcgacctcgc acactccctc gccaccaccc gcgccgggtt 7500 

cgagcaccgc gccgtcctcc tcgccaccga ccaccccgcc ctgaccgacg gcctcaccgc 7560 

cctcgccgac gccgacgacc cggccgccgc ccccgcctgg atcaccggca ccacccgggc 7620 

cgagacccgg ctcgccgtcc tgttcaccgg ccagggcgcc caacgcctcg gcgcgggacg 7680 

ggaactcgcc gcccgtttcc cggcgttcgc caccgccctc gacgcggcgc tcgacgcctt 7740 

caccccgcac ctcgaccgcc ccctgcgcga ggtcctgtgg ggcaccgacg ccgccctgct 7800 

cgaccgcacc gcatacgccc agccggccct cttcgccgtc gaagtggcgc tctaccggct 7860 

gatcgaatcg ttcggcgtcc gccccgacca cctcgccggc cactccgtcg gcgagatcgt 7920 

cgccgcgcac ctcgccgggg tcctctccct ggccgacgcc gccaccctcg tcgccgcccg 7980 

cggtcgcctg atgcaggcgc tgcccgacgg cggggcgatg atcgccgtcc aggcgtcgga 8040

agccgacgtc gccccgctgc tcgccgggca cgaggaccag gtcgcgatcg ccgccgtcaa 8100 

cggcccctcc gccgtcgtcc tgtccggcgc cgaagccacc gtcaccgcgc tcgccgaaca 8160 

gctcgccgcc gacggccgca agacccgccg gctgcgcgtc tcgcacgcct tccactcgcc 8220

gctcatggag ccgatgctcg acgccttccg cgccgtcgtc gaagacctca cgctccagcc 8280 

gccgctcctg ccggtcgtct ccaacctgac cggcaagccc gccaccgtcg cccaactcac 8340

ctccgccgac tactgggtcg accacgtccg gcacgccgtc cgcttcgccg acggcatcga 8400 

ctggctcgcc cggcacgaca ccaccgcctt cctcgaactc ggccccgacg gcgtgctgtc 8460 

cgccatggcc caggactgcc tggacgccgc cgacgcagac gccgtcaccc tccccgccct 8520 

gcgcgccggg cgccccgagg agcacaccct caccaccgcc ctcgccggtc tgcacgtcca 8580 

cggcgccacc ctggactgga ccggctgctt cgccggcacc ggcgcccgcc gcaccgacct 8640 

gccgacctac gccttccagc gccgccgcta ctggcccaag gccctccaga gcggcaccgc 8700 

cgacctgcgc tcggtcggcc tcggtgccgc ccaccacccg ctgctctccg ccgccgtctc 8760 

cctcgccgac gcaggcggca ccctgctcac cggccgcctc tcccggcaga cccacccctg 8820 

gctcgccgac cacaccgtcc gcggcaccac cctgctgccc ggtaccgcct tcctcgaact 8880 

cgccgtccgc gccggcgacg aggtcggctg cgaccgcgtc gaggaactca ccctcgccgc 8940 

accgctcctg ctgcccgaac agggcggcgt ccaggtccag ttgtggatcg gcaaccccga 9000 

cgtgtccggt cgccgcaccg tcaacgtcca cgcccgcccc gacaccggcg acgacacccc 9060 

ctggaccgcc cacgccaccg gcgtcctcac caccgccgac gcctcccgcc agctcccggc 9120 

ttcgtccgag cagggcggca cccccctcgc cggcgacccc caccccgccc tcgacgcggc 9180 

ccagtggccc ccggccggcg ccgaaccgct gccgctggac ggccactacg accgcctcgc 9240 

cgacggcggc ttcggctacg gcccggtctt ccagggcctg cgcgccgcct ggcgcggcgg 9300 

cgacgtcgtc tacgccgagg tcgagctgcc cgaggccggc cggtccgacg ccgaggcgtt 9360 

cggcctccac cccgccctgc tcgacgccgc cctgcacgcc gcgcccttca ccggcctcgg 9420 

cgaacgcggc cggggcggcc tgccgttctc ctgggagggc gtctccctcc acgccggcgg 9480

cgccaccacc ctccgcgtcc gcctgacccc ggtcgccgac gacgcgctcg ccctgaccgt 9540 

cgccgacggc accggcgcgc ccgtgctgtc cgtcgactcg ctcgtcctgc gcagcgtggc 9600 

gacccaacag ctcgacacgg ccgccgccgt cgcccgtgac gccctcttcc gcctcgactg 9660 

gacccccgtc cagccgaccg ccaccgaccc cgggcccgtc gccctcctcg gcgccgaccc 9720 

cttcggcctg ctcacccacg ccggattcgc cgacgccccg gcatacccgg acctcgccgc 9780 

cctcgccgcg gcggacggcc cggtcccgac caccgtcgtg ctgtccctcg ccggcaccgg 9840 

ggacgacgcg gccgacccgg cccggtccgc acaccgctgc gccgcggagg ccctcgccgc 9900 

cgtacagacc tggctcgacc accatgagcg cttcgccgcc gcccgcctgg tcttcgtgac 9960 

ccgtggtgcg acggtcgggc gtgatgttgc tgctgctgcg gtgtggggtc tggtgcgttc 10020 

ggcgcagtcg gagaatccgg ggtgttttgc tctggtcgat ctggatccgg atggtgcggt 10080 

gggtgcggct gcgctcgtcg ctgcgttggt cagtggtgag ccgcagcttg cggtgcgcgg 10140 

tgatgtgttg cgggtcgcgc gtctggtgcg gcggccgctc accgaggtcg gtgcgggtgc 10200 
tgatggcacc ggggatggcg tcggggatgg ctctggtgtg tcgttctcgg gtgagggtgc 10260
ggtcctggtc actggtggta cgggtggtct gggtgcggtg ttggcgcgtc atctggtggc 10320
cgagtatggg gtgcgggatc tgctgttggt cagtcgcagt ggtgaacgtg ccgtgggtgc 10380
tggggagttg gtggcggagc ttgcgggtgt gggtgcgcgg gtgcgggtgg ttgcgtgtga 10440
tgtgaccgat cgtgccgcgg tggtggagtt ggttggcggg catgcggtgt ccgcggtggt 10500
tcatgcggct ggtgtgctgg atgacggcat ggtgggtgcg ttgaccgggg agcggttgtc 10560
cgcggtgctg cggccgaagg tggatgctgt ctggcatcta catgaggcga cccgcggcct 10620
ggacctggac gcgttcgtcg tcttctcctc cctcgccggg gtcttcggca gtcccggcca 10680
ggccaactac gcggccgcga acgccttcct ggacgcgctg atgacgcggc gccgggcgga 10740
gggactgccc ggcctgtcac tcgcatgggg accgtgggag cagtcgggcg gaatgacggg 10800
caccctgacg gacgtcgacg ccgaacggct ggcccgctcc ggtgtcccgc cgctctccgt 10860
ggcgcagggc ctggccctct tcgacgctgc cgtggccggg accgacgcca cctgcgttcc 10920
ggtccgcctg gacctccccg tcctccgcgc acggggtgaa gtgccgccgc tgctgaggtc 10980
gttgatccgc gtccgggcgc gccgagccgc cgtcgccggc tccgccaccg cgggcaacct 11040
cgcccagcgc ctgcgccgcc tggacgagga cggccgcgac gagatggtcc tggacctcgt 11100
ccgcggtcag gtcgccctcg tcctcggcca cgcgaccggt ggcgacgtcg acgccggccg 11160
tgccttccgc gacctcggct tcgactcgct gaccgccgtc gaactgcgca accgcctcaa 11220
caccgtcacc ggcctgcgcc tgcccgccac cctggtcttc gactacccga ccgtccggca 11280
cctcgccacg tacgtcctgg acgagttgtt gggcacggat gccgaggtgg cgaccgtgca 11340
gccggccgcg gttgcggtgg cggacgatcc gatcgtcatc gtgggcatgg cctgccgcta 11400
ccccggtggc gtcagctccc ccgaggacct gtggcgcgtg ctcaccgaag gcaccgacgc 11460
cgtctcgggc ttcccgacca accgtggttg ggacgtcgaa tccctctatc acccggaccc 11520
tgaccacccg ggtacctcct acacgcgctc gggtgggttc ctgcatgagg cgggggagtt 11580
cgatccgggg ttcttcggga tgagtccgcg ggaggcgttg gcgaccgatt cccagcagcg 11640
gttgttgttg gagtcgtcgt gggaggcgat cgagcgggcc ggtattgatc cggtgagttt 11700
gcggggtagt cggacgggtg tgttcgcggg ggtgatgtac agcgattaca gcgcgatgtt 11760
ggcgagtccg gagttcgagg gtttccaggg cagtgggagt tcgccgagtt tggcgtcggg 11820
tcgggtggcc tacacgttgg ggttggaagg cccggcggtg acggtggata cggcgtgttc 11880
gtcgtcgttg gtggcgatgc actgggcgat gcaggcgttg cgtagtggtg agtgtgggtt 11940
ggcgttggcc ggtggtgtga cggtgatgtc gacgcctgcg gtgtttgtgg actttgctcg 12000
gcagcggggt ttgtcgccgg atggccggtg caaggcgttt gcggatgcgg ccgatggtgt 12060
gggctggtcc gagggcgtcg gcgtgttggt cctggagcgg cagtcggacg cggtgcgcaa 12120
tggtcacgag attttggctg tggtgcgggg ttcggcggtc aaccaggatg gtgcgtccaa 12180
tggtttgacg gcgcccaatg gtccgtcgca gcagcgggtg atccggcagg cgttggccag 12240
tggtggtctg acggccggtg acgtggatgt ggtggaggcg catggtacgg gtacgacgct 12300
cggtgatccg atcgaggcgc aggcgttgtt ggcgacgtat gggcgggatc gtgagcctga 12360
gcggccgttg ttgttgggtt cggtgaagtc gaatctgggg catacgcagg ctgctgcggg 12420
tgtggcgggc gttatcaaga tggtgttggc gatgcggcat ggtgtggtgc cgcggacgtt 12480
gcatgtggat gcgccttctt cgcatgtgga ctggtccgag ggtgcggtgg agctgctcag 12540
tgagcaggcg gcctggccgg agacgggtcg ggtgcggcgg gcgggtgtct cctccttcgg 12600
catcagcggt accaatgtgc atgtcatcgt cgagcaggcg ccgggcgcca aggcgatcgc 12660
cgcggccggt gcggcgcggc gcacgccggg tgccgtgccg gtgctgctct cggggcgtgg 12720
ccggagtgcc ctgcggggcc aggccgcccg cctgctcgga cacctccagg cccgacccga 12780
cgccgaactc gtcgatgtcg cactgtcgtt ggcgaccacc cggtcccgct tcgagcagcg 12840
ggccgccgtc gtggcgcagg accgcgacca gctgatcgcc tcgctggggg cgctggccgc 12900
cgaccgcccc gaccccgccg tcgtcgaggg cgaggccgcc ggacgcggcc ggaccgcggt 12960
gctgttcacc ggacagggca gccagcgggc cgccatgggg cgtgaactcc acgaggtgca 13020
gccggagttc gccgcggcgt tcgacgcggt gtgtgccgtt ttcgacccgc tgttggaccg 13080
gccgctgcgc gaggtggtgt tcgccgagga cggcagcgac gaggccgcac tgctggacga 13140
gaccggttgg acgcagccgg ctctgttcgc cgtcgaggtg gcgctgttcc ggctggtgga 13200
gagttggggt gtccgtccgg acttcgtggc cggccattcc atcggtgaga tcgcggcggc 13260
gcacgtcgcc ggggtgctga cgttggagga cgcctgccgt ctggtggccg cgcgggcgac 13320
gctgatgcag gcgctgccga ccggcggcgc gatgatcgcg atccaggcca ccgaggacga 13380
gatcgcggcg cacctcgacg acacggtggc gatcgccgcc gtcaacgggc cgcagtccgt 13440
ggtgatctcc ggtgacgagg aggccgccga aacgatcgcc gccacgttcg ccgaacgcgg 13500
gcgcaagacc aagcggctgc gggtgagcca tgccttccac tcgccgcgga tggacgggat 13560
gctggacgcc ttccggatcg tcgccgaggg gctgacctac cgggcgccgc gcatcccgct 13620
cgtctccgac ctcaccggcc ggcgcgccga cgatgcggag gtgtgcaccg cggagtactg 13680
ggtgcggcac gtccgagagg ccgtgcggtt cgccgactgc gtgcggacgc tgcgcgacgc 13740
cggggccacc accttcctgg aactgggctc cgacggcctg ctgaccgcga tggccgagga 13800
caccctcggt gacgaccacg acgccgaact ggtgccgatg ctgcgcgccg ggcgcgccga 13860

ggaactggcc gcggccaccg ccctggcccg cctccaggtg cgcggcgtgg acgtggactg 13920

ggcggcgtac ctcgccggca ccggcgcccg acgcaccgac ctgccgacct acgccttcca 13980

gcacgcgtac tactggccgc agctgccgac cccggccgcc gccctcgccg ccgccgatcc 14040 

cgccgaccag cagctgtggg ccgctgtgga gcgcggcgac gcccgcgaac tcgccgacat 14100 

cctcggcctg ggcgaacagg acctcacgcc gctggactcc ctgctgcccg ccctcacctc 14160 

gtggcggcgc ggcaaccagg agaagcacct cctggacacc ctgcgctacc gcgtggagtg 14220 

gacacgactg agcaagccga ccgccccggt cctcgacggc acctggctgc tggtcgcctc 14280 

cgacgccacc gcggccgacc agccagccct cctcgacggc ctggccgacg ccctcggctc 14340 

gcacggcgcg cgggtgcgtc gcctgcttct ggacgactcc tgcgcggacc gcgcggtgct 14400 

cgccgaacga ctggcgcgga ccgccgacgt ggacgccgcg acccaggtgc tgtccgtgct 14460 

gccgctcgac gagcgggacg ccgacgactg cccgccgctc acccgcggac tggcgctgac 14520 

cgtcgcgctc gtccaggccc tcgccgacac cggcgcccag ggccggctgt ggaccgccac 14580 

ccgcggcgcc gtctccacca accccgccga cccggtcacc caccccgtcc aggccgctgc 14640 

ctggggcctg ggccggggcg tcgccctgga gcacccacgg ctgtggggcg gcctggtcga 14700 

cctgccgcag gtcttcgacg agcgggccgg acagcggctc gccgggatcc tcgccgtcaa 14760

ggacgcaccg gacggcgagg accaggtggc gctgcgggcc accggagtct ccggccgccg 14820

gctcgtccgc cacaccgtcg aagcgctgcc cacggccgcg gagttcaccg ccaccggcac 14880 

tgtcctgatc actggtggca ccggtggcct gggcgccgag gtcgcccggt ggctggcccg 14940

cgccggcgcc cagcacctcg tcctgaccag ccgccgcggc ccggacgcgc cgggcgccgc 15000 

cgaactccgg gccgaactgg agggctacgg gccgtcggtg tccgtcgtcg cctgcgacgt 15060 

cgccgaccgg gacgcgctcg ccgccgtcct caccgcactg cccgaggaac tgccgctgac 15120 

cggtgtcgtg cacaccgcag gcgtcggcca ctacggcccg ctggacaccc tgagcaccgc 15180 

cgagttcgcc ggcctcaccg ccgccaagct cgccggcgcc gcccacctcg acgccctgct 15240 

cgccgaccgc gaactggact tcttcgtcct cttcggctcc atcgccggtg tctggggcag 15300 

tggcaaccag agcgcctacg gcgccgccaa cgcctacctc gacgcgctcg ccctgcaccg 15360 

ccgcgcccgc ggcctcgccg cgacctccgt cgcctggggc ccgtgggccg aggccggcat 15420

ggccgccgac gatgccgttt ccgagaccct gcgccgccag ggcctcggcc tgctcgaccc 15480 

ggccccggcc atgaccgagc tgcgccgcgc cgtcgtccgg caggacgtca ccgtcaccgt 15540 

cgccgacgtg gactggcagc gctacgcacc gctgttcacc tccgcccggc ccagcgccct 15600 

gatcgccggc ctgcccgagg tccgcgccct cgccgccgac gagcgcaccg agcaggacgc 15660 

caccggcgcc tccgaggtcg tcacccgcgt ccgcgccctg gccgaacccg agcaactgcg 15720 

cctgctgacc gacctcgtcc gcaccgagtc cgccaccgtc ctcggccaca gctccgccga 15780 

cgccgtgccc gagggccgcg ccttccgcga cgtcggcttc gactcgctga ccgcggtcga 15840 

gctccgcaag cgcctgggcg ccgcgaccgg gctgtccctg cccagcacca tggtcttcga 15900 

ctacccgaca ccgctggaac tcgcccagta cctgcgggcg gagatcctcg gcgcggtgct 15960 

ggaagtcgcc ggcccggtcg ccaccggcgg cgccgacgac gagccgatcg ccatcatcgg 16020 

catggcctgc cgcttccccg gcggcgtcag ctccccggaa cagctgtggg acctggtcgc 16080 

ctccggcacc gacgcgatca gcgagttccc cgtcaaccgc ggctggcaga ccgggcacct 16140 

cttcgacccg gaccccgacc ggcccggcac cacctactcc acccagggcg gcttcctcca 16200 

cgaggccgac gagttcgacc ccaccttctt cggcatctcg ccccgcgagg cgctggtcat 16260 

ggacccgcag cagcggctcc tgctggagac cacctgggag tccttcgagc gcgccgggat 16320

ccgcccggaa accctccgat ccaccctgac cggcaccttc gtcggctcca gctaccagga 16380 

gtacggcctg ggcgcgggcg acggcaccga gggccacatg gtcaccggca gcagccccag 16440 

tgtgctctcc ggccgactgt cgtacgtctt cggtctggaa ggcccggcgg tcaccgtcga 16500 

caccgcctgc tcgtcctcgc tcgtggcgct gcacctggcc tgccagtcgc tgcgcaacgg 16560 

cgagagcaac ctggccgtcg ccggcggcgc cacgatcatg acgacgccca acccgttcat 16620 

cgcgttcagc cggcagcgcg ccctcgccaa ggacggccgc tgcaaggcgt tctccgacga 16680 

cgcggacggc atgacgctcg ccgagggcgt cggcgtcgtc ctcgtcgagc ggctctccga 16740

cgcgcagcgc aacggccacc cggtcctggc cgtcctccgc ggctccgcca tcaaccagga 16800 

cggcgcctcc aacggcctga ccgcgcccaa cggcccgtcc cagcagaggg tcatccgcca 16860 

ggccctcgcc aacgcccgcc tcgcgcccgg ggacatcgac gccctggagg cgcacggcac 16920 

cggcacaccg ctcggcgacc ccatcgaggc ccaggcactg ttcgccacct acggccgcga 16980 

ccgcgacccc gagagcgcgc tgctgctcgg ctcggtgaag tccaacatcg gccacaccca 17040 

gtccgccgcg ggcatcgcca gcgtcatcaa gatggtcatg gcgctgcgcc actccgaact 17100 

gccgccgacc ctgcacgccg acgcgccgtc ctcgcacgtg gactggtcgg ccgggacggt 17160 

ccggctgctg acccaggcgc gcgcctggcc ggagaccggt cgcccgcgcc gggccgcggt 17220

gtcctcgttc ggcatcagcg gtaccaacgc ccatgtcctg ctggagcagg cgcccgtcgc 17280 

ggacaccccg gccgaggagc ggcccgccgt ggcgccggtc ccgatcgccg ccggcgtcgt 17340 

cccgtgggtg gtcaccgccc gcagcgccgc cgccctgcgc ggccaggccg agcgcctcct 17400 
cgcgcacgcc gaaaccgtcg gaaccgccct gccggccgcc ggaccgctcg acatcggcct 17460
gtcgctggtc tccgcgcgcg cccgtttcga gcaccgtgcc gtcgtcgtcc cgcccgcggg 17520
caccgacccg ctggccgcgc tgcgcgccgt cgcgacggac gggccctcgc ccgtggtcgc 17580
ccgtggcgtg gcggacgtcg agggtcggac ggtgttcgtg ttccccggtc agggttcgca 17640
gtgggtgggg atggggtccc aactccttga tgagtcggcg gtgttcgcgg agcggattgc 17700
cgagtgtgcg gcggcactcg ccgagttcac cgactggtcg ctggtcgatg tgctgcgggg 17760
tgtggtgggt gcgccgtcgt tggagcgggt cgatgtggtg cagccggcgt cgttcgcggt 17820
gatggtgtcg ttggctgcgt tgtggcgttc ccgtggtgtg ttgccggatg cggtggtggg 17880
gcattcgcag ggtgagatcg ctgctgcggt ggtgtcgggt gcgctgtcgt tgcgggacgg 17940
ggcgcgggtg gtggcgctgc ggagtcaggc cattggtcgt gcgttggcgg ggcggggagg 18000
gatgatgtcc gtcgcgctgt cggtggacgt gctcgaaccg cggttggtcg agttcgaggg 18060
gcgggtgtcg gtggccgccg tcaacggccc gcgctccgtc gtggtcgccg gcgagcccga 18120
ggcgctggac gcgctgcacg cccggctgac cgccgacgac atccgggccc gccggatcgc 18180
ggtggactac gcctcgcact cgcaccaggt cgaggacctg cacgaggaac tgctggaggt 18240
gctggcggag ctggcgccgc gcacgtcgga ggtgccgttc ttctcgaccg tgaccggcga 18300
ctggctggac accgcgcgga tggacgccgg ctactggttc cgcaacctgc gcggacgggt 18360
gcggttcgcg gacgcggtgg cggacctgct ggcggcggag taccgcgcat tcgtcgaggt 18420
cagctcgcac ccggtgctgt cgatggcggt gcaggaggcg atcgacgagg ccggcgtgcc 18480
ggccgtcgcc gccggcaccc tgcgccgcga ccagggcggc accgaccgct tcctgctgtc 18540
ggccgccgag gtcttcgtgc gcggtgtgga cgtggactgg gcggggctgt tcgaggggac 18600
cggtgcgtcc cggatcgacc tgcccaccta cgccttccaa cacgaacacc tgtgggccgt 18660
cccgcccgcc ccggaggccg tcgccgccgc cgacccggac gacgcggcct tctggaccgc 18720
ggtcgaggac ggtgacgtct ccgcgctgac cgccgcgctc ggcaccgacg aggactccgt 18780
cgccgccgtg ctgcccgccc tgacctcctg gcgccgggcc cgccgcgacc gctccaccgt 18840
ggacgcctgg cgctaccgcg tcgcctggaa acccctcggc ggcaccctgc cgcacccgtc 18900
cctgaccggc acctggctgc tggtcaccgc cgacggcatc gacgacaccg atgtggcagg 18960
ggcgttggag acctacggcg ccgaggtgcg ccggctggtc ctggacgagg agtgcgtcga 19020
ccgcgccgtc ctgcgggagc ggctggccgg cgcggaggac gtgaccggca tcgtctccgt 19080
cctcgccgcc gccgagcgga ccgacgcggt accgggcacc tccctggtgc tcggcaccgc 19140
cctgaccgtg gcactgatcc aggccctcgg cgacgccgaa atcgacgctc ccgtatgggc 19200
gttgacccgc ggcgcggtct ccaccggccg ggccgacgag ctgaccgcgc ccgtccaggc 19260
acaggtcacc ggcatcggct ggaccgcggc gctggagcac ccgcagcgct ggggcggcac 19320
cctcgacctg cccgccgccc tcgacgcccg ggccgcccag cggctcgccg ccgtgctgtc 19380
cggcgccctc ggcagcgacg accagctggc catccggccc tccggggtct tcacccgccg 19440
catcgtgcgg gccgaggcca ccgccgggcg gcccgccggc acctggacgc cgcgcggcac 19500
cacactggtc accggcggct ccggcaccct cgccccgcac ctcgcccgct ggctggccca 19560
acgcggcgcc gagcacctgg tcctgatcag ccggcgcggc acggccgccc cgggcgccgc 19620
cgaactcgtc gcggaactgg ccgagtcggg caccgaggcg accgtcgccg cctgcgacat 19680
caccgaccgc gacgcggtcg ccgcgctgct ggccgacctc aaggccgacg ggcgcaccgt 19740
ccgcaccgtc gtgcacaccg ccgccaccat cgagctgcac accctggacg ccaccaccct 19800
ggcggacttc gacccggtgc tgcacgccaa ggtcaccggc gcccaggtcc tcgccgaact 19860
gctcgacgac gaagagctgg acgacttcgt cctgtactcc tccaccgccg gcatgtgggg 19920
cagcggcgcc cacgccgcct acgtcgccgg caacgcctac ctcgccgcgc tcgccgagca 19980
ccgccgggcc aacggactgc ccgccctgtc gctgtcctgg ggcatctggg ccgacgacct 20040
caaactgggc cgggtcgatc cccagatgat ccggcgcagc ggcctggagt tcatggaccc 20100
gcagctggcc ctgagcggcc tgcagcgggc gctggacgac aacgagaacg tgctcgcggt 20160
cgccgacgtg gactgggaga cctaccaccc cgtctacacc tccggccgac ccaccccgct 20220
cttcgacgag gtgccggagg tccgccggct caccgcggcc gccgagcaga gcgccgggac 20280
cgtcgccgag ggcgagttcg ccgccgcgct gcgcgccctg tccgacgccg agcagcagcg 20340
caccctgctg gagaccgtcc gcaccgaggc ggcgtccgtc ctcgggctgt cctccgccga 20400
ggacctcacc gaccagcggg ccttccgcga cgtcggcttc gactcgctga ccgccgtcgg 20460
cctgcgcaac cggctcgcct ccgtcaccgg cctgacgctg ccctcgacga tggtcttcga 20520
ctaccccaac ccggccgcgc tcgccgccta tctgcacggc gagctggccg gcgcccggtc 20580
cgccgccgcc ggcgccgccg ccgtcccgac cggcgccccc gacgccgacg acccgatcgc 20640
gatcgtcggc atgagctgcc gctaccccgg cggggtcggc tccgccgagg acctgtggcg 20700
gatcgccctg gacgaggtcg acgcgatctc cggcttcccc gccgaccgcg gctgggacgc 20760
cgagggcctc tacgacccgg accccgaccg gcccggccgc acctactccg tccagggcgg 20820
attcctgcgc gacgtcgccg agttcgaccc gggcttcttc gggatctcgc cgcgcgaggc 20880
gctgtcgatg gacccgcagc agcggctcct gctggagacc gcctgggagg cgttcgagca 20940
cgccggcatc gacccggtcg gccagcgcgg cagccgcacc ggcaccttcg tcggcgccag 21000
ctaccaggac tacgcctccg gcgtgcccaa cagcgagggc tccgaaggcc acatgatcac 21060
cggcacgctc tccagtgtgc tgtccggccg ggtgtcctac ctcttcggct tcgagggccc 21120
cgccgtcacg ctcgacaccg cctgctcctc ctccctggtc gccatgcacc tggcctgcca 21180
gtccctgcgc aacggggaga gctcgctggc cctggccggc ggcgtcagca tcatgtccac 21240
cccgatgtcg ttcgtcggct tcagccggca gcgcgccctc gccgaggacg gccgctgcaa 21300
ggcgtacgcg gacggcgccg acgggatgac cctcgccgag ggcgtcggcc tggtgctgct 21360
ggagcggctg tccgacgccc gcgccaacgg gcaccaggtg ctcgccgtga tccgcggctc 21420
cgccgtcaac caggacggcg cctccaacgg cctgaccgca cccaacggcc cgtcccagca 21480
gcgcgtcatc cgccaggcgc tggccaactc cgcggtggcg cccggcgaca tcgacgtcct 21540
ggagggccac ggcaccggta ccgccctcgg cgaccccatc gaggcgcagg ccctgctcgc 21600
cacctacggc caggaccgcg cccccgaacg gccgctgctg ctcggctcgg tgaagtccaa 21660
catcggccac acccagatgg catccggcgt cgccagcgtc atcaagctcg tccgcgccct 21720
ccaggaaggc gtggtgccca agtccctgca catcgaccgg ccctccaccc acgtcgactg 21780
gtcctcgggc gccatcgggc tgctcaccga acgcaccccg tggcccgaga ccggccggcc 21840
gcgccgcgcc gccgtctcct ccttcggcat cagcggcacc aacgtccaca ccatcctcga 21900
acaggccccc gcggacgagg cgcccacgcc cgccgacccg ccgcgggacg gcctggtgcc 21960
ggtcctgctc tccggccgcg gcgaggccgc gctgcgcgcc caggccgccc gcctgctcgc 22020
cttcgtcgag gagcggcccg aggcccacct caccgacctc gcccactccc tcgccacctc 22080
gcgcgccgcg ctggaacgcc gcgccgccgt catcgccgcc gaccgcgaca ccctgacccg 22140
cggcctgcgc gccctgtccg acggccggcc cgaccccggc ctggtccagg gcaccgcggg 22200
acgcggccgg accgccttcc tgttcaccgg acagggcagc cagcgccccg gcatgggccg 22260
cgaactccac gaccgctacc cggtgttcgc cgacgcgctg gacgaggtgc tggcccggct 22320
cgacgacgga ccggaccggc cgctgcgcga ggtgctgttc gccgcgcccg actccgccga 22380
ggccgcgctc ctggaccgga ccggctacgc ccagcccgcg ctgttcgccg tcgaggtcgc 22440
gctgttccgc ctgctgacgt cctggggcct gaccccggac tacctggccg gccactccgt 22500
cggcgaactc gccgccgcgc acgtcgccgg cgtgctgtcg ctggacgacg cctgcactct 22560
ggtcgccgcc cgcggccggc tcatgcaggc gctgcccgag ggcggcgcga tggtcgccct 22620
ggaggccgcg gaggacgagg tcctgccgct cctggagggg ctcaccgacc gggtgtccgt 22680
cgccgccgtc aacgggccgc ggtccgtggt cgtcgccggc gtcgaggagg acgtgctcct 22740
cctcgccgac ctcttcgccg ccgacgggcg ccggaccaag cggctgcggg tgagccatgc 22800
cttccactcg ccgctgatgg acgccatgct cgacgacttc gccgccgtcg cccgcgggct 22860
gacctaccac ccgccgacga tcccgttcgt gtcgaacgtc agcggcggcc tggccaccgc 22920
cgaacaggtc cgcacgcccg actactgggt cgggcacgtc cgcgccgcgg tccgcttcgc 22980
cgacggcatc gactggctcg ccacccaggg cgacgtccac accttcctgg agctcggccc 23040
ggacggcgtg ctcagcgcca tggcccggga gagcctcacc gacccgtccc gcacggcact 23100
gctgccgacc ctgcgcggcg accggcccga ggaacctgcc ctggtcaccg cggtcgccgc 23160
ggcccacgcg cacggcgccc gcgtcgactg gagcgggtac ttcgccgacc acggcgcgcg 23220
ccggaccacg ctgccgacct acgcgttcca acgcgagcgg tactggcccg acaccaccgc 23280
cgccacgagc gcccacacgc ccggatccgc cctcgacgcc gagttctggg ccgccgtcga 23340
gcgggacgac gtcgccgccc tcgccgcctc cctggacctg gacgacgcca ccgtcaccgc 23400
gatggtcccc gcgctcaccg cctggcgccg gcgccgcggc gagcagaccg aactggactc 23460
ctggcgctac cgcgtcacct ggaagccgcg cggcggcgcc accgcacccg ccgccctcac 23520
cggccgctgg ctcgtgctcg tcccgcacga ccaccaggac cgtcaggacg acgcgaccgc 23580
ggcctgggca gccgacgtcg agaccgccct gggcaccacc accgtccggc tgacggtcac 23640
caccaccgac cgcgccgcgc tggccgcccg gatcaccgaa gccgccggcg accagggccc 23700
gttcagcggt gtgctgtccc tgctgccgct cgccaccggc gacgccggcc accccggtgc 23760
gcccgccgcc ctcaccctca ccaccaccgc cgtccaggcc ctcggcgacg ccggcatcga 23820
cgcgccgctg tggaacgtca cccgcggggc cgtggccgtc ggccgcgccg aacaggtcac 23880
cgcgcccgaa caggccgccg tctggggcct gggccgcgcc gtcgccctgg aactgccggc 23940
ccggttcggc ggcaccctcg acctgcccgc caccctggac ggccaggccg cccgccggtt 24000
gcgcgcggtg ctcgcggcta ccgacggcga ggacgcggtg gccctgcggc cctccggcgt 24060
cttcctccgc cgcctggccc acgccccggc cggccccgac accgcccgca ccgccttcga 24120
cccggcggcc ggcaccgtcc tgatcaccgg cggcaccggc ggcatcggcg gccacgtcgc 24180
ccgccgcctg gcccgcgacg gcgccaccca cctgctgctc accagccgcc gcggcccggc 24240
cgcccccggc gccgacgcgc tccgcgccga actggaggaa ctgggcgccc gggtcaccct 24300
cgccgcctgc gacgccgccg accgcgacgc gctggccgcg ctcctcgccg aactgcccga 24360
cgacgccccg ctgtgcgcgg tgttccacac cgccggcgtc gtcgaggacc acgtcgtgga 24420
cgcgctcaca ccggagaact tcgccgcggt gctgcgcgcc aagaccgtcg ccgcccacca 24480
cctgcacgag ctgaccgccg acctggacct cgccgctttc gtgctgttct cctccacggc 24540
cggcgtcctc ggcgccgccg gacagggcaa ctacgccgcg gccaacgccc acctggacgc 24600
cctcgccgaa caccgccgct cccacggcct gaccgcgctg tccgtcgcct ggggcccgtg 24660

ggccggctcc ggcatggtcg ccgacgccgc cgaactcacc gaccgggtac ggcgcggcgg 24720 

cttcgaaccg ctcgcccccg aaccggccgt gcgcgccctg ctgcgcgcca tcgagaacga 24780 

cgacaccacc gtcgcgctcg ccgacatcga ctgggagcgc ttccagcgcg ccttcgccgc 24840

ggtccgcccg ctgccgttcg tcgccgacct ccccgagacc ggccgggcca cccccgcgac 24900 

cgccaccggc gccgccaccg gcctgcggca gcaactcgcc gaactgccgg agcacgagcg 24960 

cccggcagcg gtcctggacc tgctgcgtac ccaggtcgcc gccgtcctcg gccacgccga 25020 

cccgcgcacc gtcgaggacg accacgcctt ccgcgacctg ggcttcgact cgctgaccat 25080 

cctggaactg cgcaacgccc tcaacgccgc caccggcctg agcctgcccg ccaccctggt 25140 

ctacgacctg cccaccccgc gcgagatggc ggacttcctg ctcgccgaac tcctcggcac 25200 

cctgcccacc gacaccgccg cgaccgtcgc cagcacggcc tcccccaagc tctcagcttc 25260 

gttcgagcag ggcggtaccc ccttcgacga cccgatcgcc gtcatcggca tcggctgccg 25320 

cttccccggc ggcgtcacca ccccggagga gctctggcag ctcctcgacg agggccgcga 25380 

cggcatcagc cgcttccccg acgaccgcgg ctgggacctc gccgcgctgg gcgccggcgc 25440 

ctccgacacc ctggagggcg gcttcctgac cggcgtcgcc gacttcgacg cccggttctt 25500 

cggcatctcg ccccgcgagg cgctggccat ggacccccag cagcggctgc tgctggagac 25560 

cacctgggag gcgctggagc gggccggcat cgacccgacc acgctgcgcg gctccaccac 25620 

cggcgtcttc gtcggcacca acggccagga ctacccgacg ctgttgcgcc gctccgcctc 25680 

ggacgtggcc ggctacgtcg ccaccggcaa caccgccagc gtgatgtccg gccgcctgtc 25740 

ctacgcgctc ggcctcgaag gcccggccgt caccatcgac accgcctgct cctcctcgct 25800 

cgtcgccctg cactgggccg gccgggcgct gcgcgccggc gagtgcgacc tcgtggtggc 25860 

cggcggcgtc tcggtcatgg ccagcccgga ctccttcgtc gagttctcca cccagggcgg 25920 

cctggcaccc gacgggcgct gcaaggcgtt ctccgacgcc gccgacggca ccgcctggtc 25980 

cgaaggcgtc ggcatcctcg tcctggaacg cctctccgcc gcccgccgca acggccacca 26040 

ggtcctcggc ctgatccgcg gcaccgccgt caaccaggac ggcgcgtcca acggcctgac 26100 
cgcgcccaac ggcctctccc agcagcgcgt catcgcccag gcactcgccg acgcccgcct 26160

gcgccccgcc gacatcgacg cgatcgaggc gcacggcacc ggcaccaccc tcggcgaccc 26220

gatcgaggcc cgcgccctga tcaccgccta cggccgggac cgggacgccg aacggccgct 26280 

gctgctgggc accgtcaagt ccaacatcgg gcacacccag gccgccgccg gtgccgccgg 26340 

cgtcatcaag atgctgatgg cgatgcgcca cggcaccctg cccaggacgc tgcacgtggg 26400 

caccccgtcc agccacgtcg actggagcgg cggcaccgtc gcgctcctcg acgacgcgcg 26460 

gccctggcca cggaccgggc agccgcggcg cgccggcgtc tccgccttcg gcgtcagcgg 26520 

caccaacgcc cacgtcgtcg tcgagcaggc cccggaaacc gaagcccccg ccgccccggc 26580 

cgccgagccg gcgccggagg ccacgcccac cgtcgtcccc tgggtcgtct ccggacgcag 26640 

ccgggaagcg ctccaggcgc agctggaccg gctcaccgcg cacaccgccg cccaccccgc 26700 

gcgctcggcg gcggacgtcg gccgctcgct ggccaccgac cgcacgctct tcccgcaccg 26760 

cgccgtgctg ctcgccggcc cggacggggt gcgcgaggcc gcccgcgccg ccgcgccccg 26820 

cacccccggc cgcaccgcgt tcctgttctc cggacagggc gcccagcacg ccctgatggg 26880 

ccacgacctg taccagcgct tcccggtcta cgccgacgca ctggacaccg tcctcgccca 26940 

gttcgacacc gtgctggacg tcccgctgcg cgccgcgctg ttcgccgcgc cgggcacccc 27000 

cgaggccgcg ctcctggacc agaccggctt cacccagccc gcgctgttcg ccgtcgaggt 27060 

cgcactgttc cggctcgccg agtcctggcg gctgacgccg gacttcgtcg ccggccactc 27120 

catcggcgag atcgccgccg cgcacgtcgc cggggtgttc tccctggagg acgcctgcac 27180 

gctggtcgcc gcccgcgcct ccctcatgca gcaactgccg cgggacggtg cgatggtggc 27240 

cctggaagcc accgaggacg aggtcgcgcc gctgctcacc gacggcgtcg cactcgccgc 27300 

ggtcaacggc ccccgctcgg tggtcgtcgc gggcgccgag gacgccgtcc gcgcggtcgc 27360 

cgaccggctc gccgccgacg gccgccgcac ccgccggctg acggtcagcc acgccttcca 27420 

ctcgccgctg atggacccga tgctcaccga cttcgcccgg gtcgcggagg gcctgaccta 27480 

ccacgagccg cgcatccccc tcgtctccac cctcctcggc gccccggccg gcgccgaact 27540

gcgcaccccc gactactggg tgcggcacgt ccgcgagacc gtgcggttcg ccgacggcgt 27600 

gcgcgccctg cacgacgccg gcgccggcac cttcgtggag atcggcccgg acggcgtgct 27660 

caccgccctg acccagcaga ccctcgacac cgtcgaggcc ggcgcgcccg ccgtcgtcgt 27720 

gccgctccag cgccgcgacc gcgccggcga cctcgccctc ctggagggcc tggccaccct 27780 

gcacacccac ggcaccggcc cgtcctggcc cgcctacttc gaggccaccg gcggccaccg 27840 

gaccgatctg cccacctacg ccttccagcg ggagcggtac tggcccgaac tcggcgcacc 27900 

cgtcgccacc gccccgcagg acccggcggc ctggcgctac cacgagacct gggccccgct 27960

gccggcaccc gaggcggccg cgccggccgg ccgcgccctg gtcctcgtcc cggccgggaa 28020 

ccgcgacacc gcgtggatga cggccgtcgc cgacgcgctc ggcgccgaca ccgtcacggc 28080 

cgaacccgac gccctggccg agcagttgac ggccgccggc gacacaccct ggcgcgtcgt 28140 

ggtgtcgctg ctcgccgccg cgtccgaggg gctgcccgcg gacggcgcct ggcccgccgc 28200 
cctcctcgcc accctggacg aggccggcgt gcacgcgccc ctgtggtgcg tcacccgcgg 28260

cgccgtcgcg gtcgcggggg aggccccgac cgccgtcggc caggccgccc tgtggggcct 28320 

gggccgggtc gccgcgctgg accacccgga ccgcttcggc ggcctggccg acctgcccgc 28380 

cgacaccgac gcgcacgccg ccgggctgct cgccgcgcac ctggccgcgc cgggcaccga 28440 

ggccgagatc gcggtccgcg ccaccggcgt ccacgcccgt cgcctggtcc gtacgccggc 28500 

cgccgccgac ggtgccacct ggctgccgac cggcaccgtc ctggttgtcg gcggcaccgg 28560 

cggcaccggc accatgggcg gccgggccgc ccgctggctg gtccgcgagg gcgcccgcca 28620

cctcgtcctg accgcccccg acggcaccac gaccgccgcg gacaccgagg ccctgacggc 28680 

cgaactggcc gcgctcggcg cccggatcac cgtcgtggac cacgacccca ccgccccgga 28740

cggcttcgcc gcgctcctcg acggactgcc cgacgacacc ccgctcaccg cggtcgtgta 28800 

cgcgccggag gccgacgccg cccccggcac cgcggccgag ctgtccgccg cactcgcccc 28860 

cgtcaccgcc ctaggcgccg ccctcaccgg ccggccgctg gacgccttcg tcctcttcgg 28920 

ctccatcgcc gggctctggg gcgtgcgcgg ccgggccgcc gaggccgcgt ccggcgccta 28980 

cctcgacgcc ttcgcccgcg cctgccgcga ccgcggcacc ccggcactgg ccgtcgcctg 29040 

gggcgcctgg gccgacctgg tcggcccgtc cctcgccgcg cacctgcgga tgaacggcct 29100 

gccggtgatg gacgcggaca ccgcactgac cgccctcagc cgggccgtcg ccgacgggtc 29160 

cgccgccgag gcggtcgccg acgtccgctg ggagaccttc gcgcccctcc accacgaggc 29220

ccgccgcacc gccctgttcg acgccctgcc cgaggcccgc ggcgcgctcg cggaggccgc 29280 

ccgggaccgc gccgaccgga agaccgccgc cggcgactac ggccggtggc tcgccgagca 29340 

gcccgccgcg gaccacgacg ccatcctgct ggcactggtc accgagaagg ccgcgaccgt 29400 

cctcggccac gccgaccacg acctgctcga acccgacctg cccttccgcg acctgggctt 29460 

cgactcgctg accgcggtcg acctgcgcaa ccagctcacc gcggaaaccg gcctcaccct 29520 

gcccgccacc ctcgtcttcg accaccccaa cccggccgcc ctcgccgccc acctgcgcgc 29580 

ccaactcctc ggcgaggcga gcgactccgc cgcaccggtg gccgcccccg tcgccctcgg 29640

tgccgacgac gacgcgatcg tcatcgtcgg catggcctgc cgctaccccg gcggggtcac 29700

ctcgcccgag gacctgtggc agctggtcgg cgacgaggtc gacgcggtcg gcgacttccc 29760 

gaccgaccgc ggctgggacc tggccgcgct cgccggcgac ggaccgggcc gcagtgccac 29820

cgcccagggc ggattcctct acgacgccac cgacttcgac cccggcctgt tcggcatctc 29880 

gccgcgcgag gccctggtga tggacccgca gcagcggatc ctgctcgaaa cgtcctggga 29940

ggccctggag cgggccggca tcgacccggc gacgctgcgc ggcagcggca ccaccggcgt 30000 

cttcgtcggc ggcggctccg gcgactaccg gccgccggag gaggccgggc agtggcagac 30060 

cgcccagtcc gccagcctgc tctccggtcg cctcgcctac accttcggca tccagggccc 30120 

caccgtgtcg gtcgacaccg cctgctcctc gtcgctggtc gcgctgcacc tggccgcgca 30180 

ggccctgcgc gccggcgaat gctcgatcgc gctggccggc ggcgtcaccg tgatggccac 30240 

cccggtgggc ttcgtcgagt tcagcgccca gggcgccctg tcgccggacg gccgctgccg 30300 

cgccttctcc gacgacgcca acggcaccgg ctggtccgaa ggcgtgggca tgctcgtcgt 30360 

cgaacggctc tccgacgccc gccgcaacgg ccaccgcgtc ctcgccgtgc tccgcggctc 30420 

cgccatcaac caggacggcg cgtccaacgg cctgaccgcc cccagcggcc ccgcccagca 30480 

gcgcgtcatc cgccaggccc tcgccaacgc ccgactgcgc cccgccgaca tcgacgccgt 30540 

cgaggcccac ggcaccggca ccaggctcgg cgaccccatc gaggcccagg ccctgctcgc 30600 

cacctacggc caggaccgcg agcggcccgt gctgctcggc tcgctcaagt ccaacatcgg 30660 

ccacacccag gccgcctccg gcgtcggcgg cgtcatcaag atggtcctcg ccatgcagca 30720

cggcgaactg ccgcgctccc tgtacgccga gaacccctcg tcgcacgtgg actggaccgc 30780 

cggccgcgcc cacctgctca ccgccaggac cccgtggccc gactccggtc ggccgcgccg 30840

cgccgccgtc tcctccttcg gcgccagcgg caccaacgcc cacgccatcc tggagcagcc 30900 

gccgcgcgag gaactccccg cgcgccccgc ggacgacggc gccccgctgc cgttcttgct 30960

ctccggccgc tcgcagaacg ccctgcgcgc ccaggcccgc cgactcctgg cccgcctcac 31020 

cgcccacccc gacacccggg ccgccgacct ggcgtactcc ctggcgacca cccgggccgc 31080 
cttcgagcac cgggccgcga tcaccgccac cgaccacgac ggcctccgca ccggcctgac 31140 
cgccgtcgcc gagggcacca ccgccccgca caccgccgaa caccacctcc agggcaccgg 31200 
aaagcgcgcc gtgctcttct ccggccaggg ctcccagcgc ctgggcatgg gccgcgaact 31260 
gcacgagcgc cacccggtgt tcgccgaggc gttcgactcc gtactggccc gcctcgacga 31320 
ccggctcgac acccccctgc gggacgtcgt ctggggcacc gacgaggagg cgctgcacgc 31380 
caccgggaac acccagcccg ccctgttcgc cgtcgaagtc gcgctctacc gcctgatcga 31440 
atcctggggc gtgcggcccg acttcgtggc cggccactcc gtcggcgagc tcgccgcggc 31500 
ccacgtcgcc ggggtgctct ccctggacga cgcctgccgc ctggtcgccg cccgcgccgc 31560 
cctcatgcag cgcctcccgg ccggcggcgc catgatcgcc gtcgaggcca ccgaggacga 31620 
ggtcaccccg ctcctcaccg acggcgtgtc cctcgccgcg gtcaacggac cgaccgccgt 31680 
ggtcctctcc ggcgcgggcg acgccgtgac cgccctgggc caggcgctgg ccgaacgggg 31740 
ccaccgcacc acccggctgc gggtcagcca cgccttccac tcgcacctca tggacccgat 31800 
gctggcggac ttccgcaccg tcgccgaggg cctggaatac cacccgccgc gcatccccgt 31860

ggtctccaac ctcaccgggg acgtcgccga cgcggccgac ctgtgctccg ccgactactg 31920 

ggtgcgccac gtccgcggca ccgtacggtt cgccgacggc gtgcgcacca tggccgaccg 31980 

cggcgtgcac ctcttcctcg aactcggccc ggacgccgtg ctgtcggcca tggcccgcca 32040 

gtgcgcaccg gacgccgtcg tcgtcccggc cctgcgccgc aaccgcgacg aggacgagac 32100 

gctggtcggc gccgtcgcgc gactgcacgt ccacggcgcg ggtccgcgct gggacgcgta 32160 

cttcgccggc cgcggcgccc agtggctgga ccttccgacg taccccttcc agcgcggccg 32220 

cttctggccg gagtcccttc cgggcgccgc atcggccgcc ccggcagccg gacagccggc 32280 

cgagaccgac gcggccttct gggacgccgt cgcacaggag gacttcaccg cattggaatc 32340

cgtactcgac gtcgagagcg acgcactgtc caaggtgctg ccggccctga tggactggcg 32400 

cagccgccag gccgacgagt cccaactggc aggctggcgc caccgcatcg tctggaagcg 32460 

gctcaccggc gccgccctgg cacaccgcaa ggcgctcagc ggcacctggc tcgcggtggt 32520 

ccccgagggc ttcgccgacg acccctgggt gaccaccacc ctggacggcc tcggtaccca 32580 

cctcgtgcat ctggaggtcg cggaggccga ccgggccgcg ctggccgacg cgatcgcggc 32640 

ccgcaccgcc gacggcaccc gcttcggcgg cgtaatctcc ctgctggccc tgcgcgagga 32700 

gctcaccggc gcggtgcccg aggggaccgc cctgaccacc accctcctcc aggccctcgg 32760 

cgacgccggc gtcgacgcac cgctgtggtg cgtcacccgc agcgccgtct ccgccggccg 32820 

caccgaccgg ccgcaccgac cgctccaagg cgccgtctgg ggcctgggcc gggtcgcggc 32880 

ccttgagtac ccgcagcgct ggggcggcct ggtggacctg ccggaggagc ccgacgagcg 32940 

gtccgcggcc ggcctcgccg ccgtcctggc cggtctggac ggcgaggacc aggtcgccgt 33000 

gcgcggcacc gcggtgctcg cccgccgcct ggtgccggct cccggccgca agccgtcccg 33060 

gccctggcac ccgtccggca ccgtcctggt caccggcggc accggcgccc tcggcgcgca 33120 

cgtcgcccgc cgcctggcca aggacggcgc ccagcacctc gtcctgctca gccgccgcgg 33180 

cccggacgct cccggtgcgg cggaactgcg cgcggaactg gacgcgttgg gcaccgacgt 33240 

cacggtcgcc gcctgcgacg tcgccgaccg cgaccagctg acggccgtcc tggacgcgct 33300 

gcccgccgac cggccgctga ccggtgtggt gcacaccgcc ggcgtcctcg acgacggcgt 33360 

actggaccgg ctcacccccg agcggttcca ggaggtgttc cgcgccaagg tcacctcggc 33420

cctgctgctg gacgagctga cccgcgaccg cgagctggcc gcgttcgtcc tcttctcctc 33480 

cgcctccgcc gcggtcggca acccgggcca ggccaactac gccgctgcca acgccgtcct 33540 

ggacgcgctc gccgaacagc gccgggtgct cggcctgccc gccacctcgg tctcctgggg 33600 

tgcctgggga ggcggcggca tggccgacgc cgacggcgcg gacgaggccg cccggcgcgc 33660

cggcgtcggc gccatggacc cgcacctcgc cgtggaagcc ctgctgcgcc tggtcgccga 33720 

gaaggagccg accgcggtgg tcgccgaggt ggccctggac cggttcgccg gcgccttcgg 33780 

cggcagccga cccagcgccc tgctgcggga gttccccggc taccgcgagg cgctcgccgc 33840 

ccaggcggag caggccgcgg acggcggcgg gctggccgcc cgactggccg cgctgccgcc 33900 

cgcccgccgc ctggacaccg ttgtggacct ggtgcgcacc cgcgccgcgc aggtgctcgg 33960

ctaccccgac accgaagcgg tcgccgccga acggtccttc cgcgacctgg gtgtcgactc 34020 

gctcggcgcc gtcgagctgc gcaaccaact gagcgcggcc accggcctga acctgccggc 34080 

gacgctggtg ttcgaccacc cgacccccct ggtcctgggg gagcacatcc tcggcgggct 34140 

cttcccggac gagcccgccg ggtccgacga cgagacggag atccgggccc tgctggcctc 34200 

cgtcccgctc gaccaactgc gggagatcgg ggtcctggag cccctgctcc agctcgccgg 34260 

acgcggcggc cgggccgcgg acggcgacga cggcgagtcc gtcgactcga tgacagtggc 34320 

agacctggtg cgggccgcgc tcaacggcca gtccgacctg tagcgcgatt gatggagcag 34380 

acgatgaacg cgcccgagaa ccccgagacc cccgagaaca acgtagtcgc cgcactccgc 34440 

gccgcggtca aggagaccga ccggctccgg cggcagaacc ggatgctggt cgcggcggcc 34500 

aaggaaccga tcgccgtggt cggcatggcc tgccgcttcc ccggcgccgt cgactccccg 34560 

gaagcgctgt gggagatggt cgccaccggc accgacgtga tctccggatt ccccgacgac 34620 

cgcggctggg acctggaggc gctgcgcaac agcggcaccg acgcccgcga caccgacgtc 34680 

agccagcgcg gcggattcct ggactgcatc gccgacttcg accccggctt cttcgggatc 34740 

tcaccgcgcg aggcggtcac catggacccg caacagcggc tcctgctgac caccgcctgg 34800

gaggccgtcg agcgggccgg catcgacgcc accacgctgc gcgccacccg caccggcgcg 34860

ttcatcggca ccaacggcca ggactacgcc tacctgctcg tccgctccct ggacgacgcc 34920

accggcgacg tcggcaccgg catcgccgcc agcgccgcct ccggtcggct ctcctacacc 34980 

ctcggcctcg aaggccccgc gctcaccgtc gacaccgcct gctcctcgtc gctggtcgcc 35040 

ctgcacctgg ccgtgcaggc gctgcgcaac ggcgagtgcg gcatggcgct ggccggcggc 35100 

gtcaacgtga tggccacacc gggctcgctg gtcgagttca gccgccaggg cgggctggcc 35160 

cgggacggcc gctgcaaggc gttcgcggac gccgccgacg gcaccggctg gtccgagggc 35220 

gccggcgtgc tgctgctgga acggctctcc gacgcccagc gcaacggcca cccggtgctc 35280 

gccgtggtcc gcggctccgc cgtcaaccag gacggcgcct ccaacggctt caccgccccc 35340 

aacggcccct cccagcagcg cgtcatccgc caggccctcg ccaacgccgg cctggccacc 35400 



http://www.ebi.ac.uk/cgi-bin/emblfetch?id=AF263912&S 10/11/2002 Christina Belisario



ggcgacatcg acgcggtcga ggcgcacggg accggcaccc cgctcggcga ccccatcgag 35460

gcgcagagca tcctcgccac ctacggccag gaccgcgccc acccggtgct gctcggctcg 35520

atcaagtcga acatgggcca cacccaggcc gcgtccggcg tcgccggcgt gatcaagatg 35580

atcatggcga tgcggcacgg cgtcctcccg cggaccctgc acgtcgaccg gccctccacc 35640

cacgtcgact ggaccaccgg cagcgtcgaa ctcctcaccg acgcccaccc gtggcccgag 35700

actggcaggc cgcgccgcac cggcatctcc tccttcggcg tcagcggcac caacgcccac 35760

gtcatcgtcg aacaggcccc cgacaccccc gccgaggcgg ctgacgacac tccgccccgc 35820

accccgcgga ccctgccgtg gctgctctcc gcccgcaccg gcgccgccct gcgcgaccag 35880

gccaccgcgc tgctcgacca cctcgaccgc cccgacggcg accgcgggcc caccgccctg 35940

gacaccgcgt tctccctcgc caccacccgc gccgccctgg aacaccggct cgccgtcgtc 36000

accggcaccg acggcaccgc cggacgggac gccctgaccg cctggctggc gcacggcacc 36060

gcccccgacg cccacgaagg acacgccgcc ggacgcaccc gctgcgcggc cctcttctcc 36120

ggccagggcg cccagcgcct gggcatgggc cgcgaactcc acgcccgttt cccggtgttc 36180

gcacgggccc tcgacaccgc cgtggacctg ctcgacgccg aactgggcgg caccctgcgg 36240

gaggtgatct ggggcaccga cgacgcgccg ctcaacgaga ccggcttcac ccagcccgcc 36300

ctgttcgccg tcgaggtcgc cctctaccgc ctgatcgaat cctggggcgt cgccccggac 36360

ttcgtcgccg gccactccat cggcgagatc gccgccgcgc acgtcgccgg ggtgttctcc 36420

ctggaggacg cctgcacgct ggtggccgcc cgcgccgggc tgatgcaggc gctgccgcgc 36480

ggcggggcga tggtcgccgt cgaggccacc gaggacgagg tcagcccgct gctcaccgac 36540

ggcgtcgcga tcgccgcgat caacggcccc acctcgctcg tcgtctccgg cgacgagacc 36600

gccaccctcg ccgtcgccgc ccgactcgcc gaacagggcc gccgcaccac ccggctgcgg 36660

gtcagccacg ccttccactc gccgctgatg gacccgatgc tcgcggagtt ccgcgcggtc 36720

gccgagggcc tgtcctacgg cgaaccgcag atcccggtgg tctccaacct caccggcgcg 36780

gtcgccgacg gcaccctgct cggcactgcc gactactggg tccggcacgt ccgcgaggcg 36840

gtccgcttcg ccgacggcat ccgcgccctc accgacgccg gcgtcggcgc cttcctcgaa 36900

ctcggcccgg acggcacgct cgccgccctg gcccagcagt ccgcccccga cgccgtctcc 36960

gtccccgtcc tgcgcaagga ccgggacgag gagcccgccg cggtcgccgc actggcccgg 37020

ctgcacaccg ccggcgtccc ggtggactgg acggcgttct acgccggcac cggcgcccac 37080

cgcaccgacc tgccgaccta cgccttccag tacgagcgct actggcccaa ggccacctac 37140

cggcccgccg acgccaccgg cctcggcctg accgccgccg accacccgct gctcggcgcc 37200

gccatgtccg tcgccgggtc cgacgagctc ctgctcaccg gcaccctgtc gctcgccacc 37260

cacccctggc tcgccgacca cgtcgtcggc ggcatggtct tcttccccgg caccggcttc 37320

ctggaactgg cggtccgcgc cgccgaccag gtcggctgcg accgggtcga ggaactcatg 37380

ctcgccgcgc cgctgatcct gcccgccacc ggcaccgtcc agatgcagat cgcggtcggc 37440

gccgcggacg acgacggcgg ccgcgacctg cgcttcttca cccggcccgg ggacgacccg 37500

gacgccgcct gggcccagca cgccaccggc cggatcaccg agggcgagcg cgtcctcgcc 37560

ctcgacacca ccacctggcc gccccgcgac gcggaacccg tcgacatcga cggcctctac 37620

gaccgctacc gcgccaacgg actcgactac gggcccgtct tccgcggcct gcgcgccgta 37680

tggcgccgcg acaccgagat ctacgccgag gtcgccctgc ccgaaggcac cgccgacgcc 37740

gacgccttcg gcctgcaccc ggccctcttc gacgccgtcc tgcacagcac cctcttcgcc 37800

tccgccgacg gcgacgaccg cagcctcctg ccgttcgcct ggaacggcgt gtccctgcac 37860

gccgcgggcg cggacgcgct gcgcgtccgg atcaccagct gcggccccga cgccgtggag 37920

atcaccgccg tcgacccgca gggccgcccc gtcgtctccg tcgaatcgct gacgctgcgc 37980

gccgccggcc ccgacgccgg caccgccgac caccgtgccg acgcgggctc cctcttccgc 38040

atggactgga cgccccgcac cgtccacgcc ccggccaccc ccgccacctg ggccgtcctc 38100

ggtaccgacc cgatcggcct gaccgaggcg ctcaccgccg ccggccccga caccgtcacg 38160

ggactccgcg acggtgtcga cgccctcggc gaactcaccg ccggcgacga ccggccggtg 38220

cccgacgtgg tcgcggtacc gctgcgcggc gccaccgacc acgggccggc cggtgcccac 38280

gacctgaccc gcaccgtcct ggccctgctc caggaatggc tggccgagga gcgcttcgcc 38340

cgctcccggc tgctgctcgt cacccgcggc gcggtcgccg acggcgagcg cggcccgctc 38400

gacctggccg ccgccccggt ctggggcctg gtgcgctccg cccagtccga gaaccccggc 38460

cgactgctgc tcgtcgacct cgacgacacc gccgagtccg ccgcccaact cccgttgctg 38520

ccggccctcc tggacgccga cgagccgcag gccgtggtcc gcgagggcac cgtccgggtc 38580

ggccggctcg cccgcctgga ctccggccgc ggcctcgtcc cgccgcccgg caccccctgg 38640

cgcctgggca gccgggccaa gggcagcctc gacggcctcg ccctgctgcc ccaccccgag 38700

gcccgacgac ccctcaccgg ccacgaggtc cgcgtcggca tccgcgccgc gggcctgaac 38760

ttccgtgacg tgctcaacgc gttggggatg tatcccgggg atgcggggct gttcggttcg 38820

gaggcggccg gtgtggtcgt cgaggtcgga ccggaggtca ccggcctggc acccggcgac 38880

cgggtcatgg gcatgctctt cggcggcttc ggaccgctcg gcatcgccga cgcccggctg 38940

ctcacccccg tcccggccga ctggtcctgg gagacgggtg cgtcggtgcc gttggtgttc 39000
ctcaccgcgt actacgccct gaaggagttg ggtggtctgc gggcggggga gaaggtgctg 39060

gtgcatgccg gtgccggtgg tgtcggtatg gcggcgatcc agatcgcccg gcatgtcggt 39120 

gccgaggtgt tcgccacggc cagtgagggc aagtgggacg tgctgcgctc cctgggcgtg 39180 

gccgacgacc acatcgcctc ctcccgcacc ctcgacttcg aggcggcctt cgccgaagtc 39240

gccggcgacc gcggcctgga cgtcgtactg aacgcgctgt ccggcgagtt cgtcgacgcc 39300

tcgatgcggc tgctcggcga cggcggccgg ttcctggaga tgggcaagac cgacatccgc 39360 

gccgcggact ccgttcccga cggcctctcc taccactcct tcgacctcgg catggtcgat 39420 

ccggaacaca tccagcggat gctgctcgac ctcgtcgagc tgttcgaccg cggcgcgctg 39480 

gccgcgttgc cggtccgcag ctgggacgtg cgccgcgccg gcgaggcgtt ccgcttcatg 39540

agcctggccc agcacatcgg caagatcgtg ctcaccgtgc cgcaacccct cgaccccgac 39600

ggcaccgtgc tcctcaccgg cggcaccggc ggcctggccg gcctgctcgc ccgccacctg 39660

gtcaccgagc acggcgcccg ccacctgctg ctggccggcc ggcgcggccc cgacgcgccc 39720 

ggcgccgccg cactccacgc cgaactgacc gccctgggcg ccgaggtcac cgtcgccgcc 39780

tgcgacgtcg ccgaccgcac cgcgctcgcc gcgctgctcg ccaccgtgcc cgccgaacac 39840

cccctcaccg cggtcgtgca caccgccggc gtcctggacg acggcaccct caccgccctg 39900 

aaccccgacc gcctcgccac cgtcctacgg cccaaggtgg acgccgcctg gcacctgcac 39960

gacctcaccc gccacctcga cctggccgcg ttcgtgctct actcctccac cgccggcgtc 40020 

atgggcggac cgggccaggc caactacgcg gccggcaaca ccttcctcga cgcgctcgcc 40080 

gcccaccgac acgccctcgg cctgcccgcc acctcgctgg cctggggcgc ctgggagcag 40140

ggcgccggca tgaccggcgc actgaccgac cacgacctgc gccgggtcag cgacgccggc 40200 

ggccaaccgc tgctcaccgc cgaacgcggc ctcgccctct acgacgccgc caccgccgcc 40260 

gacgaacccc tgatcgtccc gctcggcctc accggcggtg cgctgcccgc cggggtcggc 40320 

gtccccgccg tgctgcgcgg cctggtccgc accgcgggcc gccgggccag ggccggcacc 40380 

gccggcgtct cccgcgccgg cctcgccgaa cgcctcgccg ccctgcccga ggaggagcgc 40440

acccccttcc tcgtcgagct ggtgcgcacc gaggccgcca ccgtcctcgg ccacggctcc 40500 

accgacccgg tggacgcccg ccgcgagttc cgccaactcg gcttcgactc gctgaccgcc 40560 

atcgaactgc gcaaccgact cggcaaggcc accggcctca ccctgcccgc caccctcatc 40620

ttcgactacc cgacccccga ccgcctcgcc gtccacctcc acgacgaact cctcggcgcg 40680 

gacgccccgg tgaccgtcac cgccgccgca caggccgcgg acccggagca cgacccggtc 40740 

gtcatcgtcg gcatgagctg ccgcttcccc ggcggcgtca gctcccccga ggagctgtgg 40800 

gacctggtgg catccggcac cgacgcgatc accggcttcc ccgccgaccg cgcatgggac 40860 

cgccacccgc agctcgccgg cgcccccggc gcccgcaccg gccagggcgg attcctccgc 40920 

gacatcgccg acttcgacgc cgccttcttc ggcatctcgc cgcgcgaggc cctggccatg 40980 

gacccgcagc agcgcatcct cctcgaagtc gcctgggagg ccgccgagcg cgccggcatc 41040

gacccgcaga ccctgcgcgg cagcgacacc ggcgtgttca tgggcgtcag cggccaggac 41100 

tacgccggcc tcgtgatgcg ctcccgcgac gacatcgccg gccacgccac caccggcctc 41160 

gccgtcagcg tcgtctccgg ccgcctcgcc tacgcgctcg gcctggaggg cccggccctg 41220 

tccgtggaca ccgcctgctc ctcctccctg gtgtcgctgc acctggccgc ccaggcgctg 41280 

cgcgcggggg agtgcaccat ggccctggcc ggcggcgtca ccgtcatgac caccgccgcc 41340 

aacttcaccg gcttctcccg gatgggcggc ctcgcccagg acggccgctg caaggcgttc 41400

tccgactccg ccgacggcac cggctggtcc gagggcgccg ccgtcctggt cctggaacgc 41460

ctctccgacg cccggcgcgc cggccaccgc gtactggccg tggtgcgcgg ctcggcggtc 41520 

aaccaggacg gtgcgtccaa cggtctgacg gcgcccaacg gtcccgccca gcagcgcgtc 41580 

atccggcagg ccctggccaa cgcgggcctg acccccgtcg acgtggacgc cgtcgaggcg 41640 

cacggcaccg gcaccccgct cggcgacccc atcgaggccc aggccctgat cgccgcctac 41700 

ggcaccgacc gcgaccccga acacccgctg ctgctcggct cggtgaagtc caacatcggc 41760 

cacacccagt ccgcggccgg cgcggccggg ctggtcaaga tggtcatggc catgcgccac 41820 

ggcatcctgc cgcagaccct gcacctcacc gaaccgtcct cgcacgtgga ctggtcggcg 41880 

ggcacggtgc ggctgctcac cgagcggacc gcctggccgc ggacggatcg tccgcgtcgg 41940 

gccggggtct cctcgttcgg catcagcggc accaacgccc acgtcatcct ggaacagccg 42000 

cccgccgagc ccacccccgc cgccgacccg ggccggcccg cacccaccgt ggtggcctgg 42060 

cccgtctccg cgcagacccc ggccgccctc gacgcccaac tggaccggtt gcgcaccgcc 42120 

gccgccctgg cgccgctcga caccgcccac accctcgcca ccggccgctc gctcttcgaa 42180

caccgcgccg tcctgctcgc caccgtcggc gacccggcga ccggcgcccc cgacctgccc 42240 

gaggtcgcca ggggagcggc gacgccgcac cgcaccgcgt tcctcttctc ggggcagggt 42300 

gctcagcggt cggggatggg gcgtgaactg catgctgctt tcccggtgtt cgcggcggcg 42360 

ttcgacgagg tggtggctgt gttggatgcg gagctgggtt ctgatgctga tgggggtgtg 42420 

tcgctgcggg aggtgatgtg gggcgggggg tcggagttgt tggatcgaac gcgtttcacg 42480 

cagccggcgt tgttcgcggt ggaggtggcg ttgttccgtt tggtggcctc gtggggggtg 42540 

gggcctgagt tcgtggcggg gcattcggtg ggtgagattg cggcggcgca tgtggccggg 42600 
gtgttctcgt tggtggatgc gtgtcgtttg gtggtggcgc gggcttcgtt gatggatgcg 42660

ttgccggtgg gtggcgtgat ggttgcggtg gaggcggccg aggcggaggt ggtgccgctg 42720

ttggtcgatg gggtggcgat cgccgcggtc aacgggccgg tctcggtggt ggtctccggt 42780

gtggaggcgg ccgttgggca ggtcgtggat cagttggtgg agcggggccg gcgggtccgt 42840

cggttggcgg tcagtcatgc tttccactcg ccgttgatgg atccgatgtt ggatgccttc 42900

cgggccgtcg ccgagggcct ggagtaccac cagccgcgca tccccgtggt gtccaacgtg 42960

acgggcgagg tggccgcggc ggaggagctg tgcgcggccg actactgggt gcggcacgtc 43020

cgggcgaccg tgcggttcgc cgacggcgtc cgcaccctgg ccgagcgcgg cgccaccgcc 43080

ttcctggaga tcggccccga cggcgtactg tccgcgctcg cccgcggcgt cctgcccgcc 43140

gaggcgctcg tgacgcccac cctccgcaag gaccgcgacg aggagagcgc cctgctcgcc 43200

ggactggccc ggctgcacgt cgccggcgtg accgtcgact ggagcgccgc cctgaccggc 43260

accggcgcgc gcggcaccga cctgccgacc tacgccttcc aacgcgagcg gtactggccg 43320

gagttggccg ccgaacccgc gggcggcggc gcggatgccg cggacgcgga gttctgggcc 43380

gcggtggagc gcgcggacgc caccgcgctc gccgcccacc tggacatcga cggcgaccag 43440

ctcggcgccg tgctgcccgc actgtccgcc tggcgcaccc ggcgccgcac cacatcggcc 43500

accaacgccc tgcggcaccg ggagagttgg gaaccgctgt cgctcgccgg cacgccgcac 43560

accggcggcg tcctggtgct ggtgcccgcc gccgcgacca ccgacccctg ggtcgccgac 43620

gtcgtcgccg ccctcggccc ggacgcccgc cgggtggacg tcccggccga cggcaccgac 43680

cgggccgcgc tcgccgccct gctcaccgaa gcggccgacg acaccgcccc gaccgccgtg 43740

gtctccctgc tcgcgctcga cgagaccagc ggcgacgacg cggtaccggc cggcaccacc 43800

gccaccgccg cgctcgtcca ggccctcgcc gacaccggcg ccccggcccc gctgtgggcc 43860

ctgacccggg gcgcggtcgc cgcgctcccc gacgagcagc cgaccgcccc cgcccaggcc 43920

gccgtctggg gcctcggccg gatcgccgcc ctcgaactcc cgcgccactg gggcggactg 43980

gtcgacctgc ccgccgacct cgacgagcgc accgcacgcc gactgcccgc cgcactggcc 44040

gacgccggcg acgaggacca gctcgcgctg cgggccaccg gcgcctacgg ccgccggatc 44100

accccggcgc ccgcccccga cgacgccccc ggcaccgggt ggcagccgac cggcaccgtc 44160

ctgatcaccg gcggcaccgg cgcgctcggc cggcacaccg cccgctggct cgccgcccac 44220

ggcgccgagc acctgctgct gctcagccgc agcggccccg acgcgcccgg cgcggccgaa 44280

ctcaccaccg aactcaccgc cctcggcgcc cgcgtcaccc tcgtggcctg cgacgccgcc 44340

gaccgggagc agctgaccag ggtcctcgcc gaggtaccgc gggactgccc gctgaccggc 44400

gtggtgcaca ccgccggagt gctcgacgac ggcgtgctca ccggcctcac cccggaccgg 44460

ttcgccacgg tcttccgcgc caaggtggcc tccgccgtgc tcctcgacga gctgacccgg 44520

gacaccgacc tggccgtctt cgcgctgttc tcgtccgtcg cgggcgctgt cggcaacccc 44580

ggccaggccg gctacgccgc cgccaacgcg gtcctcgacg ccctcgccgc ccgccgccgg 44640

gcccagggcc tggccggcac ctcgatcgcc tggggtgcct gggccggcga cggcatggcg 44700

gcccgtcaca cccgccccgg cgccgaaccc gtcggcctgc tcgaccccga cctcgccgta 44760

ccggccctgg cccgcgcggt gacggagccc cagcccaccc tcgtcctcgc cgacctccag 44820

cagccgcggc tgctggaatc cctgctggcg ctgcgcccca gcccgctcct gagccggctg 44880

cccgccgccc gcaccgcggc ccgcgcggtc caggaagcgg accgccgccg agccggcgcc 44940

gcggccgacc tgcgcgacca actcgccggc accgcacccg ccgaccgcca cgccgtcctc 45000

ctccgcctgg tgcggaccac ggccgccgcg gtcctcggcc acaccggtgc cgacgccatc 45060

cgggccgaca agcccttccg cgacctcggc ttcgactcgc tcaccgcggt ggaactgagc 45120

agcgctctcg ccgccgccac cggcctggcc ctcccgccca gcctcgtctt cgaccacccc 45180

tccccgcggg cgctcgccga ccacctgcgc gccgaactca ccggcgaccg gccggaatcc 45240

gcccccgcgg ccccgccggc accggtcccc gccgcggacg acgatccgat cgtcgtcgtc 45300

ggcatggcct gccgcttccc cggcggcgtc accacccccg aggagttctg gcagctgctc 45360

gccgagggcc gggacggcat cgacgcgttc cccaccgacc gcggctggga cctcgacgtg 45420

ctcggccggc gacggccagg gccacagcgc ccaccgaggt cggcggcttc ctcttacgac 45480

gcggccgcct tcgaccccgg cttcttcgac atctccccgc gcgaggcgct cgccatggac 45540

ccgcagcagc ggctgctgct ggagaccgcc tgggaggccg tcgaacgcac cggcaccgac 45600

ccgacccggc tgcgcggcag ccgcaccggc gtgttcgtcg gcaccaacgg ccaggactac 45660

gccggcctcg tcctgcgcgc ccaggaggac gtcgaggggc acgccggcac cggactggcc 45720

gccagcgtga tctccggccg cctcgcctac gccttcggct tcgagggccc cgccgtcacc 45780

gtcgacaccg cctgctcctc ctcgctcgtc gccctgcact gggccgtcca ggcgctgcgc 45840

gcgggggagt gctccctggc cctggccggc ggcgtcaccg tcatgacgac ctcgacgagc 45900

ttcgccggct tcacccggca gggcggcctg gcgccggacg ggcactgcaa ggcgttctcc 45960

gactccgccg acggcaccgg ctggtccgag ggtgtgggcg tcctcgtcgt cgaacgccgc 46020

tccgacgcgc tccgcaacgg ccatgagatc ctggccgtgg tgcgcggctc ggcggtcaac 46080

caggacggtg cgtccaacgg tctgacggcg cccaacggtc ccgcccagca gcgcgtcatc 46140

cggcaggccc tggccaacgc gggcctggcc cccggcgacg tggacgcggt cgaggcccac 46200
ggcaccggca ccgtcctcgg cgaccccatc gaggcccagg cgctgctcgc cacctacggc 46260

caggaccggc ccgccgaccg gccgttgtgg ctcggctcgg tgaagtccaa catcggccac 46320 

acccaggccg ccgccggcgc cgccggcctg atgaagatgg tgctggccct ccaacacggc 46380 

acgctgccgc gcaccctgca cgtcaccgag ccctcgaccc gggtcgactg gtcggccggc 46440 

gcggtgcggc tgctcaccga gcggaccgtc tggccgcgga cggatcgtcc gcgtcgggcc 46500 

ggggtctcct cgttcggcat cagcggcacc aacgcccacg tcatcctgga acagccgccc 46560

gccgagccca cccccacggc ccctgccgac cgccccaccc ggacgcccgc cgtcctccca 46620 

tgggtcgtct cggcccgatc ggccaccgcg ctcgacgcgc agctcgcgcg actgcgggcg 46680 

ttcgccgccg agcgcccgga cctgccgccc gccgacgtcg cccactcgct cgtcaccagc 46740 

cgcgccacct tcgaacaccg ggcggtcctg ctggccgcgc ccgacggcat caccgcggcc 46800 

gcccgcgccg aggcccgcga acgcagcacc gcgttcctct tctcggggca gggtgctcag 46860 

cggtcgggga tggggcgtga actgcatgct gctttcccgg tgttcgcggc ggcgttcgac 46920 

gaggtggtgg cggtgttgga tgcggagttg gcgacgggtt ccggtggggg tgtgtcgctg 46980 

cgggaggtga tgtggggcgg ggggtcggag ttgttggatc ggacgcgttt cacgcagccg 47040 

gcgttgttcg cggtggaggt ggcgttgttc cgtttggtgg cctcgtgggg ggtggggcct 47100 

gagttcgtgg cggggcattc ggtgggtgag attgcggcgg cgtatgtggc cggggtgttc 47160 

tcgttggtgg atgcgtgtcg tttggtggtg gcgcgggctt cgttgatgga tgcgttgccg 47220 

gtgggtggcg tgatggttgc ggtggaggcg gccgaggcgg aggtggtgcc gctgttggtc 47280 

gatggggtgg cgatcgccgc ggtcaacggg ccggtttcgg tggtggtctc cggtgtggag 47340 

gcggccgttg ggcaggtcgt ggatcagttg gtggagcggg gccggcgggt ccgtcggttg 47400 

gcggtcagtc atgctttcca ctcgccgttg atggatccga tgttggatgc cttccgggcc 47460 

gtcgccgagg gcctggagta ccaccagccg cgcatccccg tggtgtccaa cgtgacgggc 47520 

gaggtggccg cggcggagga gctgtgcgcg gccgactact gggtgcggca cgtccgggcg 47580 

accgtgcggt tcgccgacgg cgtccgcacc ctggccgagc gcggcgccac cgccttcctg 47640 

gagatcggcc ccgacggcgt actgtccgcc ctggccgcgg cctgcctgtt cgacacggac 47700 

gccgaagtgg tgcccgcgct gcgcaagggg cgccccgagg agcacaccgc cctcaccgcc 47760 

gccgcccaac tccacgtggc cggcgtggac atcgactgga ccgcggtcct ggccggcacc 47820

ggcgggcggc ggatcgccct gcccacctat gccttccagc gcgagcggta ctggccctcg 47880

ctcgccgcac aggcccccgg cgacgccggc gggctcggcc tggaagccgg gcggcacccg 47940

ctgctcgggg ccgcgaccac cgtcgccgga tccgcggaga tcctgctcac cggccgcctg 48000 

tccaccaccg cccagccgtg gctcgcggtc tacgaggcgg acggccgcac cgtcctgccg 48060

gccgcggtcc tcgccgaact cgccgtccgc gccggcgacc aggccgactg cccgaccgtg 48120 

gcggaactga ccgtcgccgc accgctcgtc ctcaccggcg cggcggccca gcgcctccag 48180

gtccgggtgg ccgcccccga cgacaccggc cggcgcgcgc tgtccgtgca cgcccgaccc 48240 

gacgactccc ccgacagccc ctggacgctg cacgccaccg cggtcctcac ccacgacacc 48300 

ccgcagcccc cggcgccgga caccggctgg ccgccggagc gcgccgtgcc gctcgacgcc 48360 

ctgcccaccg ccaccggccc ggcccggatc gcggcggcct ggcagtgggg cgacgaactc 48420 

tgcgccgaga tcgaactccc cgaacccggc ccggcggagc gggcattcgc cctgcacccg 48480 

gcgctgctgg acaccgcggt ccgcgccggc ggcctgctgg acggcgacgc caccctggac 48540 

gccctcggct ggcggggcct cgccctgcac gccgcgtccg ccaccgccct gcgggtccgc 48600 

ctcaccccgg acggcacgga cacctgggct ctggaggcca ccgacccgca gggcgctccg 48660 

gtcgtctccg tcaccgggct caccctgggc acgcccaccg tcgaccggtc gggggccggg 48720 

gcggccgatg acggcgcgac cctgctcgac ctggagtggg tgcccgcgcc gcaggccgcg 48780 

cccaccggcg gcgaccacct cccgtacgcc gtgctcggcg atcaactcgc ggagctggac 48840

gggcagttga ggatcgcggg cgacgggccc gggcgcgtcg catcgctggc cgcgctgctg 48900 

gacggcggtg cgccgctgcc ccggctcgtc ctcgcgccgg tgctgggcgt gccgaccggg 48960 

gaaggcgacc tgcccgccgc ggtgcgcggc accaccacgg cggtgctgga gctgctgcag 49020

cgctggaccg ccgacgcccg caccgccgac agccacctgg tgatcgtcac ccgcggcgcc 49080 

gtcgccgccg gggcggagga cgtgcacgac ctggcggcgg ccccggtctg gggcctggtc 49140 

cgctcggcac agtccgaaca ccccggcagc ttcctgctgc tcgacctcga ccccgccgat 49200

cccgcgggag cctcccgcgc cgccgcgccg gccaccctgg cggccctgct cgacgcgggc 49260

gagacccagg ccgcggtgcg cgccgacacg ctcaccgtcg cccggctgac ccgggccgcc 49320

gacggacccg aggccaccgc cggacacccg gtgcgggact gggaccgcga cggcaccgtc 49380

ctgatcaccg gcggcaccgg cggcctgggc ggcctcctgg cccgccacct ggtcaccgga 49440 

cacggcatca agcacctgct gctcgccggg cgccgcggcc cggacgcccc cggcgcgcgg 49500

gccctgcgcg acgaactggc cgccctcggc gccgaggtga ccgtcgccgc ctgcgacgtg 49560

gccgaccgtg ccgcactgga ccgactcctc gcgcaactgc cgccggagca cccgctgacc 49620

gccgtcgtgc acaccgccgg cgtcctcgac gacgccaccg tcggcaccct gacgcccgag 49680

cggctggaca ccgtcctgcg cgccaaggcg gacgccgcct ggcacctgca cgacgccacc 49740

cgcgaccgcg acctggcagg gttcgtgctg tactcctcgg tcgccggtgt caccggcggc 49800
cccggccagg gcaactacgc cgccggcaac acgttcctcg acgcgctcgc cgcgcaccgc 49860

gccgcccagg gcctgcccgg actgtcgctg gcctggggac cgtgggggca ggacgccggc 49920 

atgaccggca ccctcggcgc cgccgacctg gcccgcctgg agcgctccgg catgccgccg 49980 

ctcaccccgg aacagggcct ggccctgttc gacgccgccg gcgcccgcgg cgacgggttc 50040 

gcggtggcgg tgcggctcgc ccgtggcgcc gccgcaccgg gcgccgacga ggtccccgcg 50100 

gtgctgcgtg ccctggtgcg cggccggcgc cgcacggcgg ccgcggccgg gcacgccggt 50160 

gtactggccc gccggctggc cgccctggac gccgagcagc ggcatcaggc gctgctcgac 50220 

ctggtccgca ccgagacggc cgcggtgctc ggccactccg gggcggacgc cgtcccggcc 50280 

gagcgggact tcaaccggct gggcttcgac tcgctgatgg cggtcgaact gcggacgcgg 50340 

ctggccaccg ccaccggagc ccggctgccg gccacgctcg tcttcgacca cccgacgccg 50400 

gacgcggtcg cccggcacct cgcgtcgacg ctgcccggtg ggaccgcggc cggtccggac 50460 

cgttccccgc tggccgaact cgaccggatc gccgccgagt tgtcgccgga gggcgcggac 50520

gacgccaccc gacagggcgt cgtcgggcgg ctgcggcacc tgctggcgca gtgggacggc 50580 

acccgacagg acggcggtgg gacgaccgtc gacgaccgca tcgaagcggc gagcgccgaa 50640 

gaggtcctcg ccttcatcga ccacgagctc ggccggcagg cggactcctg acccgcccca 50700 

ctcccgtcgc tcgcgcgcac cacatctgag gaaggtttca cggaccatgc cggacgaaaa 50760

gaagctcgtc gactatctga agtgggtcac gaaggacctc caccagaccc gccagcgcct 50820 

tcaggaggtg gaggcggggc gccacgaacc cgtggcgatc gtcggcatgg cctgccgctt 50880 

ccccggcggt gtgcgctccc cggaggacct gtgggagctg ctgtccgcgg gccgggacgg 50940 

catcgggccg ttccccgccg accgcggctg ggacctggcg gcgctggccg gcgacgggcc 51000 

cggtcgcagc gccacccagg aaggcgggtt cctgcccgac gcggccgcct tcgacccggg 51060 

cttcttcgac atctccccgc gcgaggcgct cgccatggac ccgcagcagc ggctgctgct 51120 

ggagaccgcc tgggaggccg tcgaacgctc cggcatcgac ccggccgggc tgcgcggcag 51180 

ccgcaccggc gttttcgtcg gcaccaacgg ccaggactac gcgcacctgg tcctcgccgc 51240 

gcaggacgac atgggcggct acgcgggcaa cggcctggcc gccagcgtgc tctccggccg 51300 

actggccttc gcgctcggcc tggaaggccc ggccgtcacc ctcgacaccg cctgctcctc 51360 

gtcactggtg accctgcacc tggccgcaca ggccgtgcgc gccggcgaat gcggcctcgc 51420

cctggccggt ggcgtcacgg tcatgacgac ctcgtcgagc ttcgccggct tcagcctcca 51480 

gggcggcctg gcgccggacg gccgctgcaa ggcgttcgcc gaggcggccg acggcaccgg 51540

ctggtccgag ggcatcggcc tgcttctcgt cgagcggctc tccgacgcgc agcgcaacgg 51600 

ccacccggtg ctcgccgtgc tgcgcggctc cgccgtcaac caggacggcg cgtccaacgg 51660 

cctcagcgcg cccaacggtc cgtcccagca gcgggtcatc cgccaggcgc tggccggcgc 51720 

cggactcgtc cccggcgacg tggacgcggt cgaggcgcac ggcaccggca cccggctcgg 51780

cgaccccatc gaggccggtg cgctgctcgc cacctacggc caggaccggc ccgccgaccg 51840

gccgttgtgg ctcggctcgg tgaagtccaa cctcggccac acccaggccg ccgcgggcgt 51900 

cgccggcgtc atcaagatgg tgctggccct gcggcatggc gtcctcccgc agaccctgca 51960 

cgtggacgcg ccctcctcgc acgtcgactg ggagagcggc gcggtgcggc tgctcaccgc 52020 

acccgtcgcc tggtccgagg gcgacgaccg ggtgcgccgg gccggcgtct cgtcgttcgg 52080 

catcagcggc accaacgccc acgtcatcct cgaacaagcc cccgatcagc cggaaccgac 52140

cgcggaagag acggctgccg cggcgcccgg cggcaccgcc gaggagcggg ccgccgctcc 52200 

cgtcgccccg cgcgccgtgc cgtggccggt cgcggcacgc accgccggcg ccctcgacgc 52260 

ccaactggtc cgggtccgcg cgctgaccac cgcgcccggc cgcaccgccg ccgacgtcgg 52320 

tcacgcgctg gccaccgccc gtaccccctt cgagcaccgg gcgctgctgg tccacgaggg 52380 

cggcgccgtc accgaggtgg cgcgcggcgc cgtccccacc ggtgaccggg gcgggctggc 52440 

cgtgctgttc tccggacagg gctcccaacg gccgggcatg gggcgcgaac tccacgcccg 52500 

ctacccggtc ttcgccgccg ccttcgacga gaccgtcgcc ctgctcgacg cccggctcgg 52560 

cacgtcgctg cgcgacatcg tctgggacca ggaccgcacc cggctcgacg acacccgcca 52620

cacccagccc gcgctgttcg ccgtcgaggt cgcgctgtac cgcctgctgg cctcctgggg 52680 

catccggccc gaccacgtca ccggacactc catcggcgag atcaccgcgg cgcacgtcgc 52740 

cggtgtgctg accctcgcgg acgcctgcac cctggtggcc gcccgcgcca ccgccatgag 52800 

cgaactgccg cccggcggcg ccatggtggc gctggaggcc accgaggacg aggtgcgtcc 52860

gctgctcacc gacgacctcg cgatcgccgc ggtcaacgcc ccccggtccg tggtcgtcgc 52920 

cggcgccgag gacgccgccc tcgccgtccg ccggcacttc gacgacctgg gccgccggac 52980

cacccggctc ccggtcagcc acgccttcca ctcgccgctc atggacccga tgctcgacgc 53040 

cttccggacg gccctcgccc cgctgacctt cgccgagccg gagatcccgg tcgtctccaa 53100 

cctcaccggc ctcccggcca ccgccgagga actcgccacc ccgcactact gggtgtgcca 53160 

cgtccggcag gccgtccgct tcggcgacgg cgtgcgcgcc ctcgccgacc gcggcgtgcg 53220

gaccttcctc gaactcggcc cggacggcgt gctgtccgcc ctggtccggg agaacctccc 53280 

cgagccgggc ctggtcgccg tgcccgtgct gcgcaaggag cggcccgagg agaccaccgt 53340 

gctggccgcc ctgggaaccc tgtgggcgca cggcgcggac gtggactggg acgcggtgtt 53400 
cgccggcacc cgcaccccgc aggccgaccc cgtcgagctg ccgacgtacg ccttccaacg 53460

cgcccgctac tggcccaccc tcggcgcccg ccacggcgac ccggccgacc tcgggcagac 53520 

cgccgccgcc cacccgctgc tgggcgccgc cgtcaccctc gccgacgccg acgagaccgt 53580 

gctcaccggc cgcctcgcgc tgccctccca cccctggctc ggcgaccacc gcagcgacgg 53640 

ccggatcacc gtccccggcg tcgccttcgc cgaactcgcc gtccgcgccg gcgacctgag 53700 

cggcaccccg cacctggcgc ggctcgacct gccggcgccg ctcaccctcg gcgacggcga 53760 

caccgtcacc ctccaggtcc gggtcggcgc ccccgacccc gcgggccacc ggccgctgac 53820 

cgtccacgcc cgcctcgcag ccaccgagga cgccccctgg accacctgcg cgaccggtct 53880 

gctcgccccg gacgcccccg aagcgcccgc ggatccgatc ggcccggccg acgccgggtg 53940 

gccgccgcgg gacgcccgcc cggtgcccgt cgccgacctc gacgcggccg ccaccgccgc 54000

aggccgccac tacggccccc atttccaggg cctgaccggg ctctggcggc gcgacggcga 54060 

ggtcttcgcc gaggtggccc tgcccaccgc caccgccgcc gaccgcgcct tcggcatcca 54120

ccccgcgctg ctggccaccg cgctccgcgc caccgccgca ctggacgacg accacaccgc 54180 

cggccacacc cccgaaccga ccggcatcac cggactcgcc ctgcacgcca ccggggccac 54240 

cgcactgcgg gtccggctca ccgcgaccgg gcccgacacc gtggccctcg ccgccgcgga 54300

cgccacgggc ggcgcggtcc tgaccgccga caccgtcacc ctcggctccc cgcaggaccg 54360 

cccggctccc gcaccggccg gccacaccgg gcagggcggc ctgttccacc tcgactgggt 54420

gccggtcgac cccggcagcc gagccaccgg cacccgctgg gccgtcgtcg gcgacgacga 54480 

actcgacctc ggctacgccc tgcaccgcgc cgacgagacg gtcagtgcct acgcggcgtc 54540 

gctgggcgga gccatcggcg acagcggtct ggcgcccgac gtcttcctcg tccccgtcgt 54600 

cggcggcccg gacgccgggc ccgacgcggt gcacgccgtc accgcccgcg ccctggggct 54660 

gctccaggag tggctgaacg agccgcggtt ggccggcgcc cgcctggtct tcgtcacccg 54720 

cggcgccgtc gcggtgcccg gcgagaccgt caccgacccg gccggcgccg ccgtctgggg 54780 

cctgctgcgc tccgcccaga ccgagaaccc gggcagtctg ctgctggtcg acctcgacga 54840

cgcgttccgg tccgccggga tgctgccgca cgtcctcacc ctcgacgaac agcagctcgt 54900 

cgtccgcgac cacgcggtcc gcgccgcccg cctggcccgg ctgccggagc cggccgccgg 54960

caccgcgccg gcccgcgcct gggacccgga cggcaccgtc ctgatcaccg gcggcaccgg 55020 

cggcctgggc gccgcgctcg cccgccacct ggtcaccgtc cgcggcgccc gccacctgct 55080 

gctcgccggc cgccgcggcc ccgaggcgcc gggcgccggc gaactggtgg cggagctgac 55140 

cgcacagggc gcggacgtgc gggtggccgc ctgcgacgtc ggcgaccgca ccgccctcga 55200 

cgcgctcctc gccacggtcc ccgcggcgca cccgctgacc gccgtcgtgc acaccgccgg 55260 

cgtcctggac gacgccctga tcggctcgct cacccccgac caactggcca ccgtgctacg 55320 

gcccaaggcc gacgccgcct ggcatctgca cgacgccacc cgcggcctcg acctggccgg 55380 

cttcgtcctg tactcctcgg tctccggcgt cctgggcagc cccggccagg gcaactacgc 55440 

cgccgccaac gcctacctcg acgcgctcgc ccggcaccgc gccgaccagg gcctcccggc 55500 

gctctccctc gcctggggcc cctggggtcg gggcagcggc atgaccgcgt cggtcagcga 55560 

cgccgacctg gagcggatgg cgcgcggcgg cctgccgccg ctgaccgtcg aggacggcct 55620 

ggccctgttc gacgccgccg tcggccgccc cgagccggcc ctggtgccca gccgcatcaa 55680 

cgtcgccggc ctgcgggacc agcaggcact gccggcactc tggcgcgacc tggtaccgcg 55740 

ggcccgccgc accgcggcca ccgccgaccg ctccccggtc acggtgcgcg agcgcctccg 55800 

ccacctcgac gagaccggcc aggagcagct gctcatcgac ctcgtcgtcg gctacaccgc 55860 

cggcctgctc ggccaccccg accccaccgc cgtcgatccc gaacggggct tcctggagct 55920 

gggcttcgac tccctggtct cggtcggcct gcgcaaccag ctcgccgaga tcctcggcct 55980 

gcgcctgccc tcgtccatcg tcttcgacag caagtcgccg gtgaagctgg cgcgttggct 56040

gcaccaggaa ctcgccaacg gcccccagcc gggcgccacc ggccccgccg ccgcggacgc 56100 

ccgtcccgcg gtgcgctccg acgacaccct ggagggcctg ttctacaacg cggtgcgcgg 56160 

cggcaagctc gtcgaggcga tgcggatgct caaggccgtc gccaacaccc ggccgatgtt 56220 

cgacaccccc gccgagctgg aggagctctc cgagccggtg acgctcgccg acggcccggg 56280 

ccggccccgg ctgatcttcg tcagcgcccc gggcgccacc ggcggcgtcc accagtacgc 56340 

gcgcatcgcc gcgcacttcc gcggcagccg ccatgtctcc gcgctgcccc tgatgggctt 56400 

cgcccccggc gagctcctcc cggccaccag cgaggccgcg gcccgtatcg tcgccgagag 56460 

cgtcctgatg gccagcgagg gcgaaccgtt cgtcatggtc ggccactcca ccggcggctc 56520 

gctggcctac ctcgccgccg gcgtcctgga ggacacctgg gacgtccggc ccgaagcggt 56580 

ggtcctcctc gacaccgcgt ccatccgcta caaccccggc gagggcaacg acctggaccg 56640

caccacgagg ttctacctgg ccgacatcga ctcgccctcg gtgacgctca acagcgcccg 56700 

gatgtccgcc atggcccact ggttcatggc gatgaccgac atccaggcgc ccgcaccgac 56760 

cgcccccacc ctcctcgtgc gcgccgcccg ggccctcgac ggcttccggc tcgacacctc 56820 

gtccgtcccc gccgacgagg tccgggacat cgacgccgac cacctctccc tcgccaagga 56880 

gcactcggca ctgaccgcgc aggccatcga gggatggctc gcggaactgc cggaccccgc 56940 

ggcctgatct ccggcccggc cggcccccga caggccgggc ccatcccccc accggggcgg 57000 
cgccacccag gtgcgccccg gcgcacccgg ccccgcgcag gcagccggag cgcccatccc 57060

cccgcatggc accaccctcc agaggagtcc ttccatgagc acaccgaccg caccgccctc 57120

cctgaaagcg gaggtgccgc ccgtcctgcg cctgagcccg ctgctgcgcg aactccagtc 57180

ccgcgccccc gtctgcaagg tccgcacccc cgccggcgac gagggctggc tggtgacccg 57240

gcacaccgaa ctcaagcagc tgctgcacga cgaccggctg gcccgcgccc acgccgaccc 57300

ggccaacgcc ccgcgttatg tgcacaaccc gttcctggac ctgctcgtcg tcgacgactt 57360

cgacctggcc cgcacgctgc acgccgagat gcgctccttg ttcaccccgc agttctcggc 57420

ccgccgcgtc atggacctga cgccgagggt ggaagccctc gccgaggggg tactggccca 57480

cttcgtcgcc cagggaccgc ccgccgacct gcacaacgac ttctcgctgc cgttctccct 57540

gtcggtgctg tgcgcgctca tcggcgtccc ggccgaggaa caggggaagc tgatcgccgc 57600

cctcaccaaa ctgggcgaac tcgacgaccc ggcacgcgtc caggaaggcc aggacgagct 57660

gttcggcctg ctgtccggcc tggcacgccg caagcgcatc acacccgagg acgacgtcat 57720

ctcccggctc tgcctgaagg tgccctccga cgagcgcatc ggcccgatcg cctccggtct 57780

gctcttcgcc ggcctggaca gcgtcgccag ccacatcgac ctgggcacgg tgctgttcat 57840

ccagcacccg gaccagctcg ccgcggccct ggccgacgag aagctgatgc gcggcgccgt 57900

cgaggagatc ctgcggtccg ccaaggccgg cggttcggtg ctcccgcggt acgcgaccgc 57960

cgatgtaccg atcggcgacg tgaccatcag ggccggcgac ctggtgctgc tggacttcac 58020

cctggtgaac ttcgaccgca cggtcttcga cgagccggag ctcttcgaca tccggcgcgc 58080

ccccaacccg cacctgacgt tcggccacgg catgtggcac tgcatcggcg cgccgctggc 58140

ccgggtcaat ctgcgcaccg cctacaccct gctgttcacc cgcctgcccg gcctgcggct 58200

ggtgcgcccg gtcgaggaac tgcgggtgct gtcggggcag ttgtcggccg gcctgacgga 58260

gctgcccgtc acctggtgac gtgatgtcgg acccgcccgg cccggttccg ggcaggcgga 58320

agtgagggcc cccgccccgc tcggccgggg gccctcacgc acgggggagc gggctcctca 58380

gtccgccgcg acgctgatcg ccccggacgg gcagagcgcc gccgcgtcgc gtacgtcgcc 58440

gggatccgcg gcgtccgccg ccccggccag gacggtcacc agaccgtcgt cgtcctggtc 58500

gaacaggtcc ggtgcggtca ggacgcactg gccggccccg acacagcggc cggggtccac 58560

ggtgatgcgc acgatcgttc ctccgagggt cggttcgtcg ccggggtcgg gccgccgccg 58620

ggcggcccga ccgggctcac ttccaggtga cgggcagttc gtgcaggccg aacagcaccc 58680

cgtcgtactt caacggcagt tcctccaccg gcacggcgag cgtgagggag gggatccggg 58740

cgaagagcgt gcggtaggcg atgtccatct cctcccgcac caggttctgg cccaggcact 58800

ggtggacacc gtagccgaac gccacgtgcg agcgcgccga gcgctcgggg tcgaactccg 58860

agggcgccgc gaaggccgtg gcgtcgtggt tggcggcggc gatcagcggc acgatgccct 58920

cgccggcccg gatcagctga ccgccgatct ccacgtcctc gacggccacc cggaatgcca 58980

ccagatcggc caccgagtag tagcgcagca gttcctcgac gatccggtcg tcgccgatcc 59040

accgcgggtt ggtgagcaac tgcaccacgc ccaggccgat gttgttcgcc gtcgtctcgt 59100

gcccggcgat cagcagcagc atcgccaccc cggacagctc ctgccgggac agtgtgcccg 59160

cggcgatcag ccggctgatc aggtcgtcac cgggtcgccg ctgcttgatc tggatgagtc 59220

ggcccagata gcgcagcagc gcctgggtcg ccttgtcgcg ctcctcgtcg gtcgaactga 59280

gccggaccag cacccgggtc cggtcctcga agaagtcccg gtcgaccttg ggcaccccga 59340

gcagactgga gatcaccagg gacggcaccg gcagcgcgaa ggactcgacg aggtccgccg 59400

aggtgcccgc ggccagcatc gcgtcgatcc gctcgtccac gatctgctgg atcaccggcc 59460

gcagttcccg caccttgcgc acggtgaact cggggatcaa ggtcttgcgg aaccggccgt 59520

gctccggtgg gtccatcgcc acgaaccacc ccggcacctc gtactgcgac ggggccccac 59580

cggtccgacc ggccttgggg aaaccgggtt tggaggggtc ggcgctgatc cgggggtcgg 59640

tcagcaccgc ccggacgtcc tcgtgccggg tcaccaacca gaccgggccg ctggggagtt 59700

cggagcggac caggcccgcc ccgccgcggt aggtcgcgta ctccggcggg gggaagggcc 59760

ggcccggcct gcgcagcggg aaggccaccg ggcactgcgg cgcggcggtc cgcgcgtcgg 59820

cttcggtgct catgccactc cgtagaacgc gcggatccgg ccggtgatga actcctgctc 59880

ctgtgcggtc agtccggtgt gcgtcggcag atagaagccg tcggcgctga gccgggccgc 59940

gttcaacgcg ggccagtcgg cggagaagta accgggctga cggctcatcg gcttgaagaa 60000

gacccgggtc tcgatgccct cgccggcgag gtacgcgcac agctcgtccc gccgctcggc 60060

ccgcaggtcg tacatccaca gcacgtcgcg cggcggcatc agcgtgatgc cggggatgtc 60120

gcgcagcgcc tcgtcgtagc gcttctcgat gtcgcggcgc agggcgagga tggtgtccag 60180

ctgctcggtc tgcgcgagcg ccaccgcggc ctgcatgttg gtcatccgga agttgtaggc 60240

cagcttcttg tgcaggaagc tgtggtcctt ggtgaacgcc atcgcccgca gatgggccat 60300

ctgctcggcc aggtgcgggt cgtgggtcag gcagacgccg ccctcgccgg ccgagatgat 60360

cttgttggcg aagagcgaga aacaggcgat gtcgccccgc ggccgcaccc cgtgcgcctc 60420

ggcggagtcc tccaccaccc gcaggttgta ctcgtacgcc aggttcagca cggcgtccat 60480

gtcgcactgc cggccgtaga tgtgcaccgg catgatcact ttggtgcgcg gggtgatctt 60540

ctcctcgatg cgggacacgt cgatgttcag gtcgtcgccg cagtccacga acaccggcgt 60600

ggcaccggtg taggtcaccg cccaggcgga cgcgatcatc gtgaactccg ggacgatcac 60660

ctcgtcaccg gggccgacgc ccagcgcgcg cagcgccagc gtcagcgccg tggtgccgga 60720 

ggagcaggcg acgccgaacg gcacgtcgtt gtacgcggcg aacgcctcct cgaaccgcct 60780 

gacgtacggc ccctgcgaag agatccagcc gccgccgacg gcctccgtca catagtcgag 60840 

ctcgcggccc tggagccacg gcatggacac cggatacgta aaggacatgg gttttgagtc 60900 

ttcctcggtc agtcggttgc cagggcggga aggccgagca gcaggtccgc cgccgcggcc 60960 

cggccgccgg cgtcccgcag cagcccggcg aagtgctccg cgcgctcggt gaaggagggt 61020

tggtcgagca cgcgggtgat cttgtccagg acgtcctcgg tgtccacggt ctccggccgg 61080 

tccagggtca ggctcacccc gaagtcctgg ccccggatcg cctggtcgtc gcagtccacc 61140 

cacaacggcc ggaccaccag cggctttccg aagtacaggc cctcgtggta gccgttgcca 61200 

ccggcatggg tgaagaacgc cttcacgttc ggatgggcca gcacgtccag ctgcgacggc 61260 

acccagccct cgatccgcag gttgtccggc agctcggcgg ccggcggcag caactcctgt 61320 

tggccgcgcg ggagtttcca caacacctgg tggccccggc cgtccagtcg ccgggcgacc 61380 

tccaccagcg acgccacctg ctcacgggtc agccgggtga tcgtgccgaa gcccatgtac 61440 

accacggact tctgcgccga cagccagtcc gacaggccgt cgtcgtccgg tgcctggggc 61500 

agcggcggca ccatcgtgcc caccagccgc agcttcggat gcatcgggaa cgggtagtcc 61560 

aactccctta cggagtagca caagacctgc tccgcatggt cgatccgcgc catcatctgc 61620 

cgcgcctggg gcgcgatgcc cagctcggtg cggacccggt tgtcctcctc gacgaccttg 61680 

cggacgtccg acgtcaggaa catcccgagc gtccgcagcc ggaacagctg gttctcgatc 61740 

cgctgagcca gggacatcgc ggccggcagc cccgagtgcg gcaccgggaa acccgacggg 61800 

gtgtaggact tggcgaacgg gacgtgcgag gtgaggacgt tgctcggcac gaacggcacc 61860 

ccgagcacga acggaatgcc cttggtgatc gccagctcgt acccgaactg gcacatgctc 61920 

tcgatcacca tcagcgccgg ctcgacctcc tcgacgatct cctccaggcg gcggtacttc 61980 

gccatccgcg actccggcgc gaacgaatgc cgaatcaccg ccgcgtgcgc cttgaaccgc 62040 

gaccgctgcg tcacctccgc atacgtcgcg tcgtcccatg tgaccgccga catctgcgag 62100 

acggtgtcgc cgagcgacgc gaaccgaacc gggctgccgt ccaccacggc cgccacctcg 62160 

tcgcgcgctt tctcgtcggt ggcgaaccac aggtccgcca cgtcgcgccg ggacaattcc 62220 

ccggccagca cgagcagcgg attgagcagg ccgctttcgg cataactgac gaacaggatc 62280 

ggccgccgat tcgcgcccat ggacaacacc cctcggaatg tggcgggccg ccgggcccgc 62340 

gcgccacgca cccgcccggc ccggtcgccg ggtgagtgca ttcgccgacg ccgccacccg 62400 

aggcgcgtgt tgccggaagg aagggtcacc ggccggcacc cggaacgcgc cgcgtggaaa 62460 

acgggtcggt tacttggtct catgccacgg accggggaat cactagtctt cggcgcgcga 62520 

cggccctttc cgggccgtgt ggccaatgcc cgtccccggc gcccgtcatt ccttagggaa 62580 

aagtacagcg tttgcgaacg tacgatccgg cacgcagagg tgacctgagg ccaacttttc 62640 

cgcaggggtg agcaaggcat gacgatcgga gccgacgagg acccggtggt ggtcgtcgga 62700 

atggcctgcc gttatccggg tggggtcgcc ggcccggagg acctgtggga actggtccgc 62760 

accggccgcg acgcgaccac cgccttcccg gacgaccgcg gctgggacct ggccgcactg 62820 

gccggcgacg gacccggccg cagcgcgacc cgcgagggcg gattcctcac cggcgccgcc 62880 

gacttcgacg ccgccttctt cggcatgtcg ccccgcgagg ccgtctccac cgacccgcaa 62940 

cagcgcctcg tcctggagac cgcctgggaa gccctggagc gcgccggcat cgacccgcac 63000 

tccctgcgcg gcagccgcac cggggtcttc gtcggcgcca gcggccagga ctacgccgcc 63060 

gtcacccacg cctcgcccga cgacctggac ggacacgccc tcaccggcct ggcccccggc 63120 

gtcgcctccg gtcgcctggc gtacgtcctg ggcctcgaag gccccgccgt caccgtcgac 63180 

accacgtcct cctcgtcgct ggtcgcgctg cactgggcgg tccgcgccct gcgcgcgggg 63240 

gagtgcagca ccgccctggc cggcggcgtc acggtgatgt ccaccccggc cgccttcgtc 63300 

ggccacaccc gacagggcgg cctcgcgccc gacggccgct gcaagccgtt ctccgacgac 63360 

gccgacggca ccgcctgggc ggagggcgtc ggcatcgtcg tcctggagca cctgtccacc 63420

gcccgcgccg ccggcaaccc cgtcctcgcc gtgctgcgcg gctcggccgt caaccaggac 63480 

ggcgcctccg acggcctcac cgcacccagc ggtcccgccc aggaacgcgt catccgcgcc 63540 

gccctcgccg acgcccgact cgcccccgcc gacatcgatc tcgtcgaggc gcacggcacc 63600 

ggcacccggc tcggcgaccc cgtcgaggcc cgggcgctgc tcgccgccta cggccaggac 63660 

cgggacccgg accgaccgct gcgcctcggt tccctgaagt ccaccctcgg ccacgcacag 63720

gccgccgccg gcatcggcgg agtgatcaag accgtcctga ccctgcggca cggcctgatg 63780 

ccgcgcatcc ggcacctggc cacccccacc cgccaagtcg actggtccca gggcgccgtg 63840 

gcccccctca ccgaccacac gccctggcca ccggccgacc gaccgcgccg cgccggcgtc 63900

tcctccttcg gcatcagcgg caccaacgcc catgtgatcc tcgaagaggc gccgcccgcc 63960 

gacgtccccg tcacccggcc cggcaccctc cgccccagca ccgtcccctg gccggtctcc 64020 

gccgccacgc ccgaagccct cgacgcccaa ctcgcccggc tccgcgccca cctgcgcacc 64080 

cactcggacc tggacccgct ggacgtcggc tactccctgg ccaccggccg cgccgcgctc 64140 

cgccaccggg cggtcctcct gccgcccgcc gacggcaccg ccgcggacgc cgtcgagcac 64200 
gcccgcggtg cggcccacca gcgccgcacc gccgtcctct tctccggcca gggcagccag 64260

cgcccgggca tgggccgcga actcgccgcc cgcttcccgg tgttcgccga cgcactggac 64320

gacgcgctgc gcgccctgga ccggcacctg gacggcccgg tgcgcgaggt gatgtggggc 64380

accgacgccg cgctcctgga ccggaccggc tggacccagc ccgccctgtt cgccgtcgag 64440

gtcgccctcc accgcctggt cgcgtccctc ggcgtcaccc ccgacttcgt cggcggccac 64500

tccgtcggcg agatcgccgc cgcccacgtc gccggcgtcc tgagcctgga ggacgcctgc 64560

cgcctggtgg ccgcccgcgc cacgctgatg caggcgctcc cggccggcgg cgcgatggcc 64620

gcgctggagg ccaccgagga cgaagtggcc ccgctgctcg gcgcacacct cgcgctggcc 64680

gccgtcaacg gccccaccgc ggtcgtcgtc gccggagccg aggacgccgt gcggcaactg 64740

accgcccgct tcgccgaccg cggccggcgc accagccggc tggccgtctc gcacgccttc 64800

cactcgccgc tgatggagcc catgctcgac gccttccggg acgtcgtgag ccgactgacc 64860

ttccaccagc cgtcgatccc gctggtctcc aacctcaccg gtgaactcgc cggcagtgag 64920

atcaccagcg ccgagtactg ggtccggcac gtccgcgaca ccgtccgctt cgccgacggc 64980

atcaccgcac tggccaaggc cggcgccgac gtcctgatcg aactcggccc cggcggcgtg 65040

ctgtccgcga tggcccgcga caccctcggc cccgacagca ccaccgacgt cgtccccgcc 65100

ctgagcaagg gacggcccga ggagaccgcc ttcgccggcg ccctcggccg cctgcacacc 65160

ctcggcgtcc ccgtcgactg gcccgccttc tacgccggca ccggcgcccg ccgcgtcgaa 65220

ctgcccacct acgccttcca gcacgtgcgc cactggccca ccccgccccg cccgaacggc 65280

gccgggcccg gcgccctcgg ccaccccctg ctcggctccg ccgtcgaact cgccgacggc 65340

ggcggcaccg tctgctccgg cgccctctcc ctccgcaccc acccctggct cgccgaccac 65400

accgtcgccg ggcgggtcgt gctgccggcc accgcgctgc tggaactcgc cgtgcgcgcc 65460

ggcgacgagg cgggctgcga cgtcctgcac gaactccacc tcaccacccc gccggccctg 65520

cccgacgacg ccgccctgca cgtccaggtg cacgtcggcc ccgccgacac caccgggcgc 65580

cgcgccgtca ccgtccacac ccgccccgac caccacccgg ccggcgactg gacccgatgc 65640

gccaccggca ccctcggcag caccccgccg tccgcagccg aagccgccac gggcggcacc 65700

ccggccgcct ggccgccggc cgacgccgaa cccctcgacc tcgccgacca ctacgaacgg 65760

ctcgccgacc gcggcttcga ctacggcccg accttccgcg gcctgcgggc cgcctggcga 65820

cgcggcgcgg agatcttcgc ggacgtggaa tgcccgcccg gcaccgccga cgacgccccc 65880

gaccacggac tgcaccccgc cctgctcgac gcggcccggc acgccgccat ggcggtggac 65940

ggcaccgtgc ccgtcgcctg gcacggcgtc cggctgcacg ccgtcggcgc caccgcgctg 66000

cgggtccgca tccgccccac cacgaccggc acgctgaccc tcaccgcggt cgacgtgcac 66060

ggcgcgccgg tcgtcaccgt cgaggccctc accgcccgcc cgctgaccga cgaggaacgc 66120

gccgccccgc ggacgccgcg gcaggcccgc ggcgagacgc ccgccgacgc ccgcccggcc 66180

cggcccgcgg cggcccgccc cggcccggcc ggcgaacccc tcccggacac caccgggtcc 66240

caccccaccg ccggccacct cgccgcgctg ccgccggccg cccgggagcg ccagctgctg 66300

gacctggtgc gcacccaggc cgccgccgtc ctgggccacc ccggccccga ggccgtcggc 66360

acccgcagcg tcttcaagga gctgggcttc gactcgttgg ccggcgtcga actcgccgac 66420

cggctcaccg cccgcaccgg actgcgcctg ccggccaccc tcgtcttcaa cttccccacc 66480

cccgaacgtg ctgcccaccg cctcggggaa ctcctcgccg caaccgcccc cctcgacccc 66540

ggggcgtacg gagaggaact caccaggttc gaggcgatcg tgacgaacct gccgcaggac 66600

ggccccgaac gccgggccgt cgcggaccgg ttggacgcca tcgtctccgc actccgccag 66660

aactcgcctg cagaggtgcc ctcctcggac gaggacatcg acacggtgtc ggtcgacaga 66720

ctgctcgaca tcatcgatga agagttcgaa accacataga gaaattgttg ctttcgttcg 66780

cgacccgatg acgaggacgg accgatgcag gaaccccagc aaggccagcc ggaccagcag 66840

gagaaaatcg tcgactatct ccggcgggtc acttcagatc ttcgccgtgc ccgccgccgc 66900

attggcgaac tggaatccaa ggacaacgag cccatcgcca tcgtcggaat gggctgccga 66960

cttcccggcg gcgtcaattc gccggaatcc ctgtgggacc tggtgcgttc cggcggcgac 67020

gccatttccg gattccccgt cgaccgcggc tgggacctgg aaaccctcac cggaaacggc 67080

gacggcagca gcgccaccca cgaaggcgga ttcctctacg acgccgcgga attcgacgcc 67140

gccttcttcg gcatctcgcc gcgcgaggcg actgctatgg acccccagca gcgcctcctc 67200

ctcgaagtcg cctgggaggc gctggagcgc gccggcatcg cccccacagc cctgcgcggc 67260

agccggtccg gcgtgttcgt cggctcctac cactggggcg cgccctcggc cgacgccgcc 67320

accgaactgc acggccacgc cctgaccggc accgccgcca gcgtgctgtc cggccgcctg 67380

gcctacaccc tcggcctcga aggcccggcc gtcaccgtcg acaccgcctg ctcctcctcc 67440

ctcgtcgccc tgcacctggc ggcccagtcc ctgcgcgtcg gcgaatcctc gctcgccgtg 67500

atcggcggcg tcacgatcct caccgagccg tccgtcttcg tcgagttcag cgcccagggc 67560

ggcctggcac cggacggccg ctgcaaggcg ttctccgacg ccgccgacgg caccggttgg 67620

gccgagggcg tcggtgtcct cgtcgccgag cggctctccg acgcgcagcg caacggccat 67680

ccggtcctcg ccgtgctgcg cggctccgcc gtcaaccagg acggcgcctc caacggcctg 67740

accgccccca acggcccctc ccaggaacgg gtcatccagc aggccctcgc ccggaccggc 67800

ctgacccccg ccgacatcga cgccgtcgag gcgcacggca ccggcacccg gcccggcgac 67860

cccatcgagg cccaggccct gctcgccacc tacggccagg gacacacccc cgaccagccg 67920 

ctgtggctcg gctccctgaa gtccaacatc gggcacaccc aggcggccgc cggcgtcgcc 67980 

ggtgtcatca agatggtcat ggcgctgcgc cacggccacc tgccgccgac cctgcacgcc 68040 

gacgcgccct cctcgcacgt ggactggtcc gccggatcgg tacgcctgct gaccgagggc 68100 

cagcagtggc cggagaccgg acgtccgcgc cgggccgcgg tgtcctcgtt cggcatcagc 68160 

ggcaccaacg cgcacgccct gctggaacag gcaccccacc ccgcggacac cgcggacgcc 68220 

ggcgacgacg ccgcgcccac cgaaccggcc ggcgcgcccg ccgcgctgcc ctggatcgtc 68280 

tccggacact ccccgcaggc gctgcgcgac caggccgccg ccctggccgc cagggtcgag 68340 

accgaccccg cgctccgccc ccaggacatc gggcacaccc tgcacaccgc ccgcgccctg 68400 

ctcgaacgac gcgccgtcgt cgtcgccccc gaccgcgccg aactcctcgc ggctacccac 68460 

gagttggccg ccggccggtc cgcgaacgcc gtcgtcgagg gcctcgcgga cgtcgagggt 68520 

cggacggtgt tcgtgttccc cggtcagggt tcgcagtggg tggggatggg ggcccaactc 68580 

ctcgatgagt cggcggtgtt cgcggagcgg attgccgagt gtgcggcggc actcgccgag 68640 

ttcaccgact ggtcgctggt cgatgtgctg cggggtgtgg tgggtgcgcc gtcgttggag 68700 

cgggtcgatg tggtgcagcc ggcgtcgttc gcggtgatgg tgtcgttggc tgcgttgtgg 68760 

ggttcccgtg gtgtgttgcc ggatgcggtg gtggggcatt cgcagggtga gatcgctgcc 68820 

gcggtggtgt cgggtgcgct gtcgttgcgg gacggggcgc gggtggtggc gctgcggagt 68880 

caggccattg gtcgtgcgtt ggcggggcgg ggcgggatga tgtccgtcgc gctgtcggtg 68940 

gacgtgctcg aaccgcggtt ggtcgagttc gaggggcggg tgtcggtggc cgccgtcaac 69000 

ggcccgcgct ccgtcgtggt cgccggcgag cccgaggcgc tggacgcgct gcacgcccgg 69060 

ctgaccgccg acgacatccg ggcccgccgg atcgcggtgg actacgcctc gcactcgcac 69120 

caggtcgagg acctgcacga ggaactgctg gaggtgctgg cggagctggc gccgcgcacg 69180 

tcggaggtgc cgttcttctc gaccgtgacc ggcgactggc tggacaccgc gcggatggac 69240

gccggctact ggttccgcaa cctgcgcgga cgggtgcggt tcgcggacgc ggtggcggac 69300 

ctgctggcgg cggagtaccg cgcgttcgtc gaggtcagct cgcacccggt gctgacgatg 69360 

gcggtcttgg acctgatcga ggaggccggg gtcacggccg tcgcgaccgg caccctgcgc 69420

cgtgaccagg gtggcgcggg ccgcttcctg ctgtcggccg ccgaggtctt cgtgcgcggt 69480 

gtggacgtgg actgggcggg ggcgttcgag gggaccggtg cggcccgggt cgacctgccc 69540 

acctacgcct tccagcgcga gcggtactgg aacacccgca ccgccgccga ccgcaccccg 69600 

gccgacgccc cgatggacgc cgaattctgg gccgccgtcg aacaggcgga cgtctccgcg 69660 

ctgaccgccg cgctcggcac cgacgaggac tccgtcgccg ccatcctgcc cggcctcacc 69720 

tcctggcgcc gggcccgctc ccagcgcacc accctcgact cctggcgcta ccgcgtcacc 69780 

tggacgcccc tcgcccaggt gccccgcgcc accctgaccg gcacctggct gctggtcacc 69840 

accgacggca tcgacgacac cgatgtggca ggggcgttgg agagctacgg cgccgaggtg 69900 

cgccggctgg tcctggacga ggagtgcacc gaccgcgccg tcctgcggga gcggctggcc 69960 

ggcgcggagg acgtgaccgg catcgtctcc gtcctcgccg ccgccgagga cgacgccgca 70020 

cgccaccccg gcctcacccg gggactcgcg ctcaccgtct ccctcgtcca ggccctgggc 70080 

gacgccgagg cgaccgcgcc gctgtggttc ctgacccgcg gcgccttcgc caccggcccg 70140 

tccgaccccg tcacccggcc cctgcagagc cagatcgcgg gcgtcggctg gaccaccgcg 70200 

ctggagcacc cgcagcgctg gggcggcacc gtggacctgc ccgacaccct cgacgcccgg 70260 

gccgcccagc ggctcgccgc cgcgctgtcc ggcgccctcg gcgccgagga ccagctcgcc 70320 

gtccgcgccg ccggggtact ggcccgccgc atcgtgcgtg ccggacaccg cgccggacga 70380 

ccggcacgga cctgggcgcc gcgcggcacc accctgatca ccggcggctc cggcaccctc 70440

gccccgcagc tcgcccgctg gctggccgaa cgcggcgccg agcacgtggt gctggtcagc 70500 

cggcgcggtg ccgacgcccc cggagcgccc gaactcatcg cggaggcagc cgagtcgggc 70560 

accgaggtga ccgtcgccgc ctgcgacatc accgaccgcg acgcggtcgc cgcgctgctg 70620 

gccgacctca cggccgacgg ccgcaccctg cgcaccgtca tccacgccgc cgccgccatc 70680 

gagctgtccg cgctcgccga caccaccgtg gcggagttcg ccgacgtcgt gcacgccaag 70740 

gtcaccggcg cacggatcct cgacgaactg ctcgacgacg cggaactgga cgacttcgtc 70800 

ctgtactcct ccaccgccgg catgtggggc agcggcgtgc acgccgccta cgtcgccggc 70860 

aacgcctatc tgtccgcgct cgccgagcag cgccgcgccc gcggactgcg caccacctcc 70920 

atccactggg gcaagtggcc cgacgaccgg gcacgcgagc tggccgaccc gcaccggatc 70980 

cgccgcagcg gtctggagta cctcgacccc gagctggcgc tcaccgcgct ccagcacgtc 71040 

ctggacgacg acgagaccgt catcggcctc atggacatcg actgggacac ctaccacgac 71100 

gtgttcaccg cgggccggcc cgcgcacctc ttcgaccaga tccccgaggt gcggcgccgc 71160 

ctcgaccagg catccgtccc ggaccccgcg ggcccggccg ccgacggcct ggccgcccgc 71220 

ctgcacggcc tcgccgccgc cgaacaggac cggctgctgc tcaccctggt ccgcaccgag 71280 

gccgccgccg tcctcggcca cgcctcggcc gagtccttcc ccgagcgccg cgccttccgt 71340 

gacctcggct tcgactcggt caccgccgtg gacctgcgca accggctcgt ggccggcacc 71400 
ggactgcggc tgccctcgac gatggtcttc gaccacccca actgcgcggc gctcgccgcg 71460

ttcctgaaga cgacggcgct cggcgtcccc ggcgccgcac cgcagcagca cgccgctacc 71520 

ggcaccccgg ccgacgacga cccgatcgcc gtgatcggca tgagctgccg ctaccccggc 71580 

ggcgccgcca cccccgagga actgctgcgg ctcgccctcg acggcgccga cgtcatctcg 71640 

gagttccccg cggaccgcgg ctgggacgcc cggggcctgt acgacccgga ccccgaccgc 71700 

cccggccaca cctactccgt ccagggcggc ttcctccacg aggccgccgg cttcgatccc 71760 

ggcttcttcg ggatctcccc gcgcgaggcg gtcgccatgg acccgcagca gcggctcctg 71820 

ctggagacct cctgggaggc gttcgaacgc gccggtatcg accccgcgtc actgcgcggc 71880 

agcgccgccg gcaccttctt cggcgccagc taccaggact actcctccac cgtgcagaac 71940 

ggcacggggg agtccgaggc gcacatggtg accggcaccg cggccagtgt cctgtccggc 72000 

cgggtctcct acctgctcgg cctggagggc cccgcggtca ccgtggacac cgcctgctcc 72060 

tcctcactgg tcgccctgca cctggcctgc cagtccctgc gcgacggcga gagctccctc 72120 

gcgctggccg gcggtgcggc cgtgatggcc accccgcacg cgttcgtcgg cttcagccgg 72180 

cagcgtgccc tggccaagga cggccgctgc aagccgttct ccgacaccgc cgacggcatg 72240 

acgctcgccg agggcgtcgg cgtcgtcctg ctggagcgcc tgtcccacgc ccgcgccaac 72300 

gggcaccggg tactggccgt gatccgcggt tccgccgtca accaggacgg cgcctccaac 72360 

ggcctgaccg cgcccaacgg cccgtcccag cagcgcgtca tccgccaggc gctcgccaac 72420

gccggcctga ccggcgccga cgtcgacgcg gtcgaggcgc acggcaccgg caccaagctg 72480 

ggcgacccca tcgaggccca ggccctgctc gccacctacg gccaggaccg cgacgccgaa 72540 

cggccgctgc tgctgggctc ggtgaagtcc aacatcggcc acacccaggc cgcggccggc 72600 

gtcgccggcg tcatcaagat ggtgctggcc atggacgccg gcgaactgcc cggcaccctg 72660 

cacctcgacg cgccctccag ccacgtcgac tggaccgccg gcgccgtgga actgctgcgc 72720 

gggcgcaccc cgtggcccga gagcgggcgc ccccgccggg ccggtgtctc ctcgttcggc 72780 

atcagcggca ccaacgccca cctgatcctc gaacaggccc cggccaccga gccgccagcc 72840

gaccccgacc gcctccggga caccgccacc gacaccgtcg tcccctggcc gctcgccgcc 72900 

aagtccccgg ccgccctgcg cgcccaggcc gcccggctcc tcgccaccgt cgagcacgac 72960

cccgacctcc cgcccgcccc cgtgggccac gccctggcca ccacccgcgc cgccctcgaa 73020 

caccgcgccg tcgtcgtcgg cgagcgccgc gaggacttcc tgcgcggcct ggccgccctg 73080 

tccaccggcg cctcgacggc cggcctggtc agcggcatcg ccggccccga ccccgaggga 73140 

gcggtcttcg tcttccccgg ccagggatcc cagtggtggg gaatgggccg cgaactcctc 73200 

gccacgtccg aggtgttccg caccgcgatc gatgactgcg cgacggccct cgccccgtac 73260 

gtcgactggt cgctgcacga cgtcctggcc ggcgagggcg accccgccct gctggagcgg 73320

gtggacgtgg tccagcccgc gctgttcgcc atgatggtcg ggctgtccgc gctctggcgc 73380 

tcccacggcg tcgtcccggc ggccgtggtc ggccactcgc agggcgagat cgccgcggcc 73440 

tgcgtcgccg gagccctcag cctggccgac gccgcccgcg tggtggcgct gcgcagccag 73500 

gcactgccgc aactgtccgg acgcggcggc atgatgtcgg tctccgcccc cgtagagcgg 73560 

gtcaccgcac tcctcgcccc gtggcaggag gcgctgtccg tcgccgcggt caacggcccc 73620 

tcgtccgtgg tcgtctccgg cgacaccgac gcgctcgacg ccctgcacac cgcctgccag 73680 

gaacagggcg tgcgggcccg caaggtgtcc gtggactacg cctcgcacgg gcggcacgtc 73740

gaggccgtcc gcgacgaact cgcccgcgtc ctcgcgccgg tcgacccgcg cgcccccgag 73800 
gtgccgttct actcgacggt caccggcgac cgcgtggacg acgccgcctt cgacggcgcc 73860 
tactggtaca ccaacctccg ccagaccgtc cgcatggagg aggccacccg cgccctcctc 73920 
gccgccggac accgcgtctt catcgaggtc agcccgcacc cggtgctcgc cgccccgatc 73980 
caggagacgc aggaggccgt agcggaggcc accggcgggt ccgcggtggt cctcggctcg 74040
ctccgccgcg acgagggcgg cccgcggcgc ttcctgacgt cgctcgccga ggcccacacc 74100
cacggcgccc cggtcgactg gaccaccacc ttcgcccggt ccgcctacca gccggtggac 74160 
ctgccgacct accccttcca acgacaggac ttctggcccg aggcccggcc cgccaccccg 74220 
gccgccggcg ccgacgcgtc cgacgccgcg ttctggcaac tggtcgagaa ccaggacctc 74280 
gccgcgctcg ccgacgcgct cggcgtcccc gccgacgacg agcacaccgc gctcggcacc 74340 
gtgctgccgg ccctgtccgc ctggcgcgcc aaggcccagg cccgcacccg gatcgacgaa 74400 
ctccgctacc acgtccagtg gacccgggtc gccgagcccg cggcggcccc caccaccggc 74460 
cggctgctgg tcgccgtccc gccggaccac gccgacgccc cctgggtcgc cgcggcgctc 74520 
gacgccctgg gcaccgacac cgtccgcttc gaggccaagg gcaccgaccg cgcgggatgg 74580 
gccgcacaga tcgcccaact cgtcgaggac ggcgaggagt tcaccggcgt ggtgtcgctg 74640 
ctggccgccg ccgaggatct ccacccggac ttcggctcgg taccgctggg gctggggcag 74700 
accctcgtcc tcgtccaggc cctcggcgac gccggcctga ccgcgcccct gtggtgcctg 74760 
acccgcggcg ccgtcgccac cggccgcgac gacgccctcg acagcccgac ccagggcgcc 74820 
ctgtggggcc tcggccgggt cgtggccctg gaacaccccg accgctgggg cggcctgatc 74880 
gacctgcccg ccaccctcga cgcccgcgcc gcggcccgcc tcaccggcct gctcgccgac 74940 
cccgccggtg aggaccaact cgccgtccgc gccaccggcg tgctcgcccg ccgcatggtg 75000 



http://www.ebi.ac.uk/cgi-bin/emblfetch?id=AF263912&Submit=Go 10/11/2002 Christina Belisario



cacgccgcgc 
atcaccggcg 
gccgcccacc 
cgggccgaac 
cgcgacgccc 
ttccactccg 
ctcgacgcgc 
cccctcgacc 
ggccagcccg 
tccctcgacc 
accgtccccg 
cacgcgatcg 
ctcatggact 
agcacggtgc 
gacgtggact 
cgcggccggg 
acccccgacg 
gccgtcgaac 
gtcttcgacc 
accgcccccg 
gcgctcgcca 
aaactcgccg 
tccctcgacg 
tgaaagagag 
gtcgtccctg 
ggtcgagccc 
cgaggacctg 
ccgcggctgg 
cggcggcttc 
cgaagcggtc 
ggaacgcgcc 
caccaccggc 
ctcgaccacc 
cgagggcccg 
ggccgtccag 
catggccacc 
ccgctgcaag 
gctggtcctg 
gcgcggctcc 
ctcccagcag 
cgacgccgtc 
gctcctggcg 
caacatcggc 
gttgcagcgc 
ctggaccgcc 
cgcccgccgc 
cgaacaggcc 
cgtcccgtgg 
caccggccac 
cgacggacgc 
ccacggaacc 
cccgggcatg 
gatcacagcg 
cgacgccgac 
cgccctctac 
catcggcgag 
cctcgtcgcc 
gatccgcgcc 
cgtcaacggg 
ggcgcgcttc 



cgtccgcgcc 
gcaccggcgg 
tggtcctgac 
tggaggccct 
tggccgccct 
ccggcgtggc 
tgctgcgcgc 
tcgacgcgtt 
gctacgccgc 
tgcccggcgc 
aggtccacga 
gcgcgctcca 
gggaggcgtt 
ccgaagccgt 
ccgcgacccc 
ccctggtcga 
ccatccccgc 
tgcgcaaccg 
accccacccc 
aggacgccgg 
ccgtccccat 
acggagacgc 
acatggacgc 
ctggagccca 
aaggagatcg 
gtcgcggtcg 
tgggagctgg 
gacctggaga 
gtcgaggacg 
gccatggacc 
ggcatcgacc 
caggactacg 
ggccacgccg 
gccgtcaccg 
gcgctgcgcg 
ccgggcccgt 
cccttctccg 
atgcggctct 
gccatcaacc 
cgcgtcatcc 
gaggcccacg 
acctacggac 
cacacccagg 
ggcgtgctgc 
ggctccgtcg 
gccggcgtct 
cccaccgccc 
gcgctctccg 
ctcgccgaca 
gccaccttcg 
gccggcgaag 
ggacgcgaac 
ctcctcgaca 
ctgctgaacg 
cgcctggtcg 
ctcgccgccg 
gcccgcgccc 
accgaggacg 
cccacctccg 
accgcccagg 




ccgcaccggg 
catcggcggc 
cagccggcgc 
gggcgcccgc 
gctggccgac 
cgacggggac 
caaactgacc 
cgtgctcttc 
cgccaacgcc 
gtccgtcgcc 
gcgactgcac 
gcagatgctg 
cgcaccgagc 
ccgcgcggtg 
gccgctccgc 
ggcggtccgc 
cggccgtgcc 
gctgcgcacc 
cacggccctc 
caccggccgc 
cggacggctg 
gaccgacgcc 
cgaagccctg 
ccatgagcac 
agcggctgcg 
tcggcatcgg 
tcgccgaggg 
agttggccgg 
ccgccggctt 
cgcagcagcg 
cgtccaccct 
gcgaggtcat 
ccagcgtcat 
tcgacaccgg 
gcggcgagtg 
tcgtcgcctt 
accgggccga 
ccgacgccca 
aggacggcgc 
gcgccgcgct 
gcaccggcac 
aggaccggcc 
ccgcctccgg 
cgcgcagcct 
acctcctcga 
cctccttcgg 
ccgaagagcc 
cccgcaccgc 
cccccgacgc 
aacaccgcgc 
gcccctgcgc 
tccacgcccg 
cccacctcga 
acaccggctg 
cgtccctcgg 
cgcacgtcgc 
gcctcatgca 
aggtcacccc 
tcgtcgtcgc 
accgcaagac 



cgccgctggc 
cgggtcgccc 
ggcccggacg 
gtcaccctcg 
ctccccgccg 
gcccgggcag 
gccgcccacc 
tcctccggcg 
tacctcgacg 
tggggcacct 
cgccaagggg 
gaggacgacg 
ttcaccgcga 
accggcgacc 
cgccacctgg 
gccgaggcgt 
ttccgcgacg 
gccctcggcc 
gccggccacc 
cccgacgacc 
cgcaaggcgg 
cccgcccccg 
ctgcggctgg 
gaaccccgac 
ccggcagaac 
ctgccgcttc 
gcgcgacgtc 
cggcggcgag 
cgaccccggc 
catcctgctg 
gcgcggtacc 
caaggcgtcc 
ctccggccgg 
ctgctcctcg 
ctccatggcc 
caccgcgcag 
cggcaccggc 
gcgcgagggc 
ctccaacggc 
ggacagcgcc 
caccctcggc 
gcggcccctg 
tgccgccggc 
gcacgccacc 
cgagacggtc 
catcagcggc 
caccaccgaa 
cgccgccctc 
cgaccccctc 
cgtcctgctc 
cgtcctcttc 
cttcccggtg 
ccgcccgctg 
ggcccaaccc 
cgtcaccccc 
cggggtcctc 
ggccctgccg 
ccacctcacc 
cggcaccgag 
cacccggctg 



ggggccgcgg 
gctggatggc 
cacccggcgc 
ccgcctgcga 
accagccgct 
ccgacctgac 
acctgcacga 
ccgcggtctg 
ccctcgccgc 

ggggcgaggt 

tccgcgccat 
acaccaccct 
cccgacccag 
cgggcaccac 
sggagctgtc 

ccgcgaccct 
tcggcttcga 
tgccgctgcc 
tcggcgcgct 
ccgacgcccg 
gcctcctcga 
aggccgacgc 
ccaccgagaa 
aagtacgtcg 
gaacagctgg 
cccggcggcg 
atcgggccgt 
ggcggcagcc 
ttcttcggca 
gagatcacct 
cccaccggcg 
gccgaggacg 
ctctcctaca 
tccctggtcg 
ctggccggcg 
agcggcctgg 
tggggcgagg 
cgcccggtcc 
ctgaccgctc 
cacctcaccg 
gacccgatcg 
tggctcggct 
gtgatcaaaa 
gaacccacca 
gcctggcccg 
accaacgccc 
cccaccgtcc 
gacgcccagc 
gacgtcggct 
cccgacggca 
tccggccagg 
ttcgccgcgg 
cgcgaggtcg 
gccctgttcg 
gacttcgtcg 
tccctcgaag 
cgcggcggcg 
gacgacgtct 
gaagccgtcg 
cgggtcagcc 
cacctgcctg
cgagcacggc 
cgccgcactc 
cgtcgccgac 
cacctccgtc 
cctagatcag 
gctgaccgcc 
gggcagcggc
ccaccgcagg 
cggcatggcc 
ggaaccggac 
cgccgtgacc 
cgccctgttc 
ggccggcgac 
cgccgccgag 
cggccacgac 
ctcggtcacc 
ggccgcgctc 
gctcttcggc 
catccgcgag 
catggtgctg 
cccctcggaa 
ctcggcgaac 
aggcactccg 
tggcggcggc 
tcacctcccc 
tcccgcagga 
tcgcgcaggt 
tctccccgcg 
gggaagccct 
tcttcgtcgg 
tcgaggtcta 
ccctcggcgc 
ccctgcactg 
gcgcgtccat 
ccgccgacgg 
gcgccggcat 
tcgccgtgct 
ccaacggccc 
ccgccgacat 
aggcccaggc 
cggtgaagtc 
tgatcatggc 
cggacgtcga 
agaccggacg 
acgtcatcct 
gccccgccgt 
gcgcccgcct 
acgcgctcgc 
ccgaactcgc 
gctcccagcg 
ccttcgacga 
tctggggcac 
ccgtcgaggt 
gcggccactc 
acgcctgcac 
cgatgctcgc 
cgatcgccgc 
ccgcgatcgg 
acgccttcca 



75060 
75120 
75180 
75240 
75300 
75360 
75420 
75480 
75540 
75600 
75660 
75720 
75780
75840 
75900 
75960 
76020 
76080 
76140 
76200 
76260 
76320 
76380 
76440 
76500 
76560 
76620 
76680 
76740 
76800 
76860 
76920 
76980 
77040 
77100 
77160 
77220 
77280 
77340 
77400 
77460 
77520 
77580 
77640 
77700 
77760 
77820 
77880 
77940 
78000 
78060 
78120 
78180 
78240 
78300 
78360 
78420 
78480 
78540 
78600 
ctcgccgctc atggacccga cgctggcgga attccgcgcc gtcgccgcgg gcctgaccta 78660

ccacgagccg cgcatcccgg tcctctccaa cctcaccggc accgtcgccg ccgtcgccga 78720 

cctgtgctcc gccgactact gggtccgcca cgtccgcgag gcggtccgct tcgccgacgg 78780 

cgtcaccgcc ctcaccgacc gcggcgtgac cacgctcgtc gaactcggcc cggacggcgt 78840 

gctgtccgcc atggcccagg aatccctgcc ggacggcgcc gccgccgtgc cgctgctgcg 78900 

caaggaccgc cccgaggagc tctccgccgt caccggcctg gcccgcgccc acgtccgcgg 78960 

cgtcacggtc cgctgggccg gcctcttcga cggcaccggc gcgcgccgcg ccgacctgcc 79020 

cacctacccc ttccagcacc agcggttctg gccgaccgcg gcccgcgccg cccaggacgt 79080 

caccgccgcg ggactgggcg ccgccgacca cccgctgctc ggcgccaccg tcgaactcgc 79140

cgacggggcc ggctacttgt tcaccagccg gctctccgtc cggacccacc cctggctcgc 79200 

cgaccacggg gtccagggcc gggccctgct gcccggcacc gccttcgtcg aactggccgt 79260 

ccgcgccggc gacgaggccg gctgcgaccg cgtcgaggaa ctgaccctgg ccgcccccct 79320 

ggtgctgccc gagcgcggcg gcgtccaact ccaggtccgc gtcggcgccc ccgacgccgc 79380 

cggccgccgc accctcggca tcttctcccg cgtcgaggac ggcttcgacc tgccctggtc 79440 

gcaacacgcc accggcgtcc tgaccgccgg cgccggcgcc cccgacccca ccttcgacgc 79500 

caccgtctgg ccccccagcg gcgccgaacc cgtcgacctc accggcgcgt acgagcgcct 79560 

ggccgcactc ggcttccagt acggccccgc cttccagggc ctgcgcgccg cctggcgccg 79620 

cgacaccgag gtctacgccg aagtggccct gcccgacggc gcggacaccg accccgccgc 79680 

cttcggactg cacccggccc tgctggacgc cgcacaacac gccgccgcct acgccgacct 79740 

cggcgccatc agccgcggcg gcctgccgtt cgcctgggaa ggcgtctcgc tcgccgccgc 79800 

cggcgccacc accgtccgcg cccggatcgc cccggccggc gaggacaccg tcaccatcgc 79860 

cgtctacgac gccgccggcg gcaccgtgct gtccgtcgac tccctggtct cccgcgaggt 79920 

ccccgccgac gcacccggcg ccgccggcac cgtccaccgc gactccctct tccacgtcga 79980 

gtggaccccg ctccagggcc gcccgggccc cgcaccggcc accgtcgccg tcctcggccc 80040 

cgacccggac gccctcgccg acaccctccg cgccaccggc atccggacca ccgccccccg 80100 

cgacctggcc gccctcgccg acgccgaagg gcccgtcccc gacctggtcg tcaccaccct 80160 

caccaccacc ccgggcgccc ccgtccccga cgccgcgcac gccaccaccg ccgccgtcct 80220

cgccctcgcc caacagtggc tcgccgacga ccgcttcgcc gacgcccgcc tggtcctcgt 80280 

cacccgcggc gccaccgacg gcaccgaccc cgccgccgcg gccgccggcg gcctgatccg 80340 

caccgcccgc accgagaacc ccggccgttt cgccctcctc gacctcgccc ccgacaccgg 80400 

ccggcccgac cccgagaccc tggccaccgc cctggccgcc agccacgacg agcccgacct 80460 

cgccgtccgc ggcaccgacg tgcacgccgc ccgcctggcc cgtgtcccgc tcgccaccga 80520 

acccaccacc tggaacccgg acggcaccgt cctgatcacc ggcggcaccg gcggcctggg 80580 

cgcggtcctc gcccgccacc tggtcgccac ccacggcgtc cgccacctgc tgctcgccag 80640 

ccgccgcggc ccggccgccg acggcgccga cgacctgacg gccgaactca ccgggctcgg 80700 

cgccaccgtc cacatcgccg cctgcgacgt cgccgacccg gccgccctcg ccgacctgct 80760 
cggcaccgtc ccggccgggc acccgctcac cgtcgtcgtc cacaccgccg gcgtcgtcga 80820

cgacggcgtc ctcggctccc tcaccccgca gcgcctggac accgtcctgc ggcccaaggc 80880 

cgacgccgcc tggcacctgc acgaggcgac ccgccacctc gacctggacg ccttcgtcct 80940 

cttctcgtcc gtcgccgcca ccctcggcag ccccggacag gccaactacg ccgccggcaa 81000 

cgccttcctg gacgccctcg ccgcccggcg cgccgccacc ggcctgcccg ccacctccct 81060 

cgcctggggc ccgtggaccc agagcgtcgg catgacaagc agcctgtccg acctcgacgt 81120 

cgagcgcatc gcccgctccg gcatgccccc gctgaccctg gaacagggca ccgccctctt 81180 

cgacgcggcc ctggccgccg ggcccgccgc cctcgccccg gtccgcctcg acctgcccgt 81240 

cctgcgcacc cagggcgaca tcgccccgct gctgcgcggc ctgatccgca cccccgtgcg 81300 

gcgcaccgcc gcccaggtct cgcagaccgc cgacggcctc gcccagcggc tcgccggcct 81360 

cgacgccgcc gcccgccggg aagccctcct ggaactcgtc cgcacccaga tcgcccaggt 81420 

cctcggccac gcggacgcca ccgaggtgga gaccggccgc cagttccagg acctcggctt 81480 

cgactccctc accgccgtcg aactccgcaa cgccctgaac accgccaccg gcctgcggct 81540 

gcccgccacc atggtgttcg actacccgac accacacgcc ctcgccgacc acctgcgcga 81600 

cgaactcctg ggcaccgagg ccgagtcgac caccgccgtc cccgtgccga cccgtaccgc 81660 

cggcaccgac gacccgatcg tcatcgtcgg catggcctgc cgctaccccg gcggcatcgc 81720 

ctcacccgag gacctctggc gcctggtcag ccagggcgcc gacgccactg gcccgttccc 81780 

caccaaccgc ggctgggacc tggacaacct ctacgacccc gaccccgacc gcccgggccg 81840 

cacccacgtc cgcgccggcg gcttcctgca cgacgccggc tccttcgacg ccgacttctt 81900 

cgggatgagc ccgcgcgagg cgatggccac cgactcccag cagcgcctgc tgctcgaact 81960 

ctcctgggaa gccgtcgaac gcgccggcat cgaccccgcc tcactgcgcg actccggcac 82020 

cggcgtcttc gccggcgtca tgtacaacga ctacggcacc accctgaccg gcgacgagta 82080 

cgaggcgttc cgcggcaacg gcagcgcccc gagcgtcgcc tccggccgcg tctcctacac 82140 

cctcggcctg gaaggcccgg ccgtcacggt ggacaccgcc tgctcttcct ctctggtcgc 82200 
cctgcactgg gcggcgcagg cgttgcgggc gggggagtgc tcgttggcgt t^^ggtgg 82260

tgtgacggtg atgtcgacgc cgagcacgtt cgtggagttc tcgcggcagc ggggtctggc 82320 

gcctgatggt cgttcgaagg cgttcgccga ggccgcggac ggcgtggcct ggtccgaggg 82380 

cgtcggcatg ctggtcctgg agcggcagtc ggacgcggtg cgcaacggtc acgagatcct 82440 

ggccgtggtg cgcggctcgg cggtcaacca ggacggtgcg tccaacggtc tgaccgcgcc 82500 

caacggcccg tcccagcagc gggtgatccg tcaggcgttg gccagtggcg gcctgtccac 82560 

ggccgacgtg gacgccgttg aggcgcacgg cacgggtacg acgctcggtg acccgatcga 82620 

ggcccaggcg ctcctggcca cctacggtcg cgaccgcgac cccgagaacc cgctgctgct 82680 

cggctcgatc aagtccaaca tcggtcacac ccaggcagcg gccggtgtcg ccggtgtcat 82740 

caagatggtc atggcgatgc ggcacggcgt gctgccgcag accctgcatg tcgacgcgcc 82800 

gtcctcgcac gtcgattgga gcgtcggcgc cgtcgaactg ctcaccgagc agaccgcctg 82860 

gccggagacc ggccgggccc gtcgcgccgg tgtctcctcc ttcggcatca gcggcaccaa 82920 

cgcccacgtc gtcatcgagc agtccccgac cgccgtcccc gccacgcccg cgtccgccga 82980

ccggtccgtc gaggaaccgc cggccgtccc ctgggccctg tccggcaaga cccccgacgc 83040 

cctccgcgac caggccgccc gcctcctcgc ccacgtcgag gcccaccccg cactgcgccc 83100 

cgtcgacatc agctactccc tgatcgccac ccgcaccgcc ttcgaccacc gcgccgtcgt 83160 

cctcggcacc gaccgcgccg aggccctgcg cgccctcacc gccctcgccg ccggcgagac 83220 

cgacccggcc gccctcaccg gcaccgtccg caccggccgc accgccttcc tcttctccgg 83280

ccagggctcc caacggctcg gcatgggccg cgtcctctac gagcggttcc ccgccttcgc 83340 

cgaagccctc gacaccgtcc tcaccgccct cgacgcggaa ctcggccacc ccctccgcga 83400 

catcatctgg ggcgaggacg ctcaactcgt cgaccggacc ggctacaccc aacccgccct 83460 

gttcgccatc gaggtggcac tcttccgcct cctggaagcc tggggcatca caccggactt 83520 

cgtggccggc cactccatcg gcgagatcgc cgccgcacac gtcgccggcg tgctctccct 83580 

cggcgacgcc tgccgcctcg tcgtggcccg cgccgtgctg atgcagtcgc tgcccgaagg 83640 

cggcgcgatg atcgccgtcc aggccaccga ggacgaggtc ctgcccctcc tcaccgacga 83700 

cgtctcgatc gccgccgtca acagcccgac ctccgtcgtc gtctccggct acgagaacgc 83760 

caccctcgcc gtcgcccggc acttcgccga ccagggccgc cgcaccacgc ggctgcgcgt 83820

cagccacgcc ttccactcgc cgctgatggc gccgatgctc gacgacttcc gcgccgtcgt 83880 

cgagagcctc accttcaccg cccccacgac ccccgtcgtc tccaacctga ccggcgaact 83940

ggccccggcc gaggcgctct gctcggccga ctactgggtc cggcacgtcc gcgaggcggt 84000 

ccgcttcgcc gacggcatcc gcaccctcgc cgaccgcggc gtcaccacct tcgtcgaact 84060 

cggccccgac agcgtgctgt ccgccatggc ccaggagtcc gcccccgaag gcgccggcac 84120 

catcccgctc ctgcgccgcg accggcccga ggaacaggcc gtcctggccg ccctctgcca 84180 

cctccaggtg ctcggcgtcg aggccgactg gtccgccacc ttccgcggcc tcgaccccgt 84240 

ccgcgtcgac ctgccgacct acgccttcca gcaccgctgg ttctggcccg ccgcccgacc 84300 

cgcccgcccc gacgacgtcc gcgccgccgg cctgggcgcc gccgaacacc ccctcctcgg 84360 

cgccgccgtg caactccccg acgacgacgg cgcactcttc accggccgcc tctccctgcg 84420 

cacccacccg tggctggccg accacaccgt cctgggcacc gtcctgctcc cgggcaccgc 84480 

actggtggaa ctcgccgtcc gcgcgggcga cgagaccggc agcggccacc tcgaagaact 84540 

caccctcgcc gcgcccctga ccctccccga ggacggcgcc accctcctcc aggtccgcgt 84600 

cggatccgcc gacgacaccg gccgccgcac cgtcaccgtc cacgcccgcc ccgacgacac 84660 

cgccgaccgc acctggacgc tgcacgccac cggtgtgctc gccaccacgc caccggccgc 84720 

cgcggcgttc gacaccacgg tctggccgcc cgccgacgcc gaacccctca ccaccgacga 84780 

ctgctacgca cacttcacca cccaccgctt cgcctacggc cccgccttcc agggcctgcg 84840 

ggccgcctgg cgcgccggcg acgtgctgta cgccgaggtc gccctgccgg agtccgccac 84900

cgacgaagcg gccgccttcg gcctgcaccc ggcgctcctg gacgccggcc tgcacgccgc 84960 

gctcctcgcc gacgaccgcg acaccggact cccgttctcc tgggaaggcg tcactctgca 85020 

cgcctccggc gccaccgcgc tacgcgtccg gctcgccccg aacggcccca acggcctgtc 85080 

cgtcaccgcc gccgacccgg ccggcaaccc cgtcgccacc gtcacccgcc tgctcgcccg 85140 

ccccctggac gccgagcagt tgaccatcca cagcgccctg acccgcgacg cgctcttcca 85200 

cctggactgg accccggtcc cgcttcccga caccgccaac tccgcgccgc cggccctcct 85260 

cggcccggac accgccgtgc tcgccgacgc cctcggcgac ccggccgtcg cacgccacgc 85320

aaccctcgac gacctcctgg ccggggacac caccccgccc gccacggtcc tcgtccccct 85380 

cggcgcccca ctcgacggcg acaccgcgca gcacgcgcac gccctcaccc gcagcgcgct 85440

gaccctcgtc cagcagtggc tcgccaccga ccgcctcgcc gactcccgcc tggtcttcgt 85500 

cacccacgga gccgtcgcca ccgacgacgc gccccccacc gacctggccg ccgccgcggt 85560 

ctggggcctg atccgctccg cgcagaccga gaaccccggc accttcaccc tcctcgacct 85620 

cgacaccgag cccgactcga ccaccgcgct cagccgcgcc ctgaccctcg acgaaccaca 85680 

gctcctcctc cgcgccggcc gcgcccgcgc cgcccgcctc acccgcaccc ccgcccccac 85740 

caccaccacc cacacgccgt ggtccgcgga cggaacggtg ttggtgacgg gtggtacggg 85800 
tggtctgggt gggttggtgg cccggcatct ggtgcggtcg tgtggggtgc ggcatttgtt 85860

gttgaccagt cgttctggtg tgggtgctgc gggtgcggcc gggttggtcg cggagttgga 85920 

gtcgttgggc gcgcgggttg tggttgcggc gtgtgatgtg ggtgatggct cggctgttgc 85980 

ggagttggtt gccggtgtgt cggagtcgta tccgttgtct gcggtggtgc atgcggctgg 86040

tgtgttggat gacggtgtgg tgggttcgtt gacgccggag cggttggctg cggtgttgcg 86100 

tccgaaggtg gatggtgcgt ggaacctgca tgaggcgacg cgtggtctgg atctggacgc 86160 

gtttgttgtc ttctcgtctg ttgcgggtgt gttcgggggt gcgggtcagg ccaactatgc 86220 

ggcgggtaat gcgtttttgg acgcgttgat ggttcatcgg gtggctggtg ggttgcctgg 86280 

tgtgtcgttg gcgtggggtg cttgggatca gggtgtgggg atgacggcgg ggctgacgga 86340 

gcgggatgtc cgtcgtgctg ctgagtcggg tatgccgttg ttgacggttg atcagggtgt 86400 

ggcgttgttc gatgcggcgt tggcgacggg gagtgccgcg ttggtgccgg tccgtctgga 86460 

cctggccgca ctgcgcaccc ggggcgacat cgcaccgctc ctccgcggcc tcgtccgcgc 86520 

accgctgcgc cgcaccgcgg ccaccggcct cgccaccggc gcggacaccg gcctcgtcca 86580 

acggctcggc cgactcgacc acgcccaacg ccacgaggca ctgctcgaca tggtccgcag 86640 

cagcgccgcg ctcgtcctcg gccacgccga cggcaacgcc atcgacgccg aacgcgcctt 86700 

ccgcgacctc ggcttcgact cgctcaccgc ggtcgaactc cgcaaccgtc tgcgcaccgc 86760

caccggcctg cacctgtcgg ccaccatggt cttcgaccac cccaccctgt ccgccctcgc 86820

ggagcacctg cgggacgagt tgttcggcgc ggtcgagagc gaggtgcggg tgccggtcca 86880 

ggcactgccg ccgaccgccg acgatcccat cgtggtggtg ggcatggcct gccgtttccc 86940 

cggtggtgtg acctcgcccg aggacctgtg gcgcctggtc gacgacggca ccgacgccat 87000 

caccaccttc ccgaccaacc gcggctggga cctggacaac ctctacgacc cggaccccga 87060 

gcacttcggc acgtcgtaca cccgctccgg tggcttcctg cacgaggcgg gggagttcga 87120 

cccggcgttc ttcggaatga gcccgcgtga ggcgctggca accgactccc aacagcgtct 87180 

cctgctggaa tcctcctggg aggcgatcga gcgggccggc atcgacccgc tgaccctgcg 87240 

cggcagcgcc accggcgtct tcgccggcgt gatgtacagc gactacggga gcatcctcgg 87300

cggcaaggag ttcgagggct tccaaggcca gggaagtgcg ggcagcgtgg cctcgggccg 87360 

cgtctcctac gccctcggct tcgagggccc ggccgtcacg gtggacacgg cttgctcttc 87420

ctctctggtc gccctgcact gggcggcgca ggcgttgcgg gcgggggagt gctcgttggc 87480 

gttggccggt ggtgtgacgg tgatgtcgac gccgagcacg ttcgtggagt tctcgcggca 87540 

gcggggtctg gcgcctgatg gtcgttccaa ggcgttcgcc gaggccgcgg acggcgtcgg 87600 

ctggtccgag ggcgtcggca tcctcgtcct ggagcgccag tcggacgcgg tgcgcaacgg 87660 

ccacgagatc ctcgccgtga tccgcggctc ggcggtcaac caggacggtg cgtccaacgg 87720 

cctgaccgcg cccaacggcc cgtcccagca gcgcgtcatc cgtcaggcgt tggccagtgg 87780 

cggcctgtcc acggccgacg tggacgccgt cgaggcgcac ggcacgggta cgacgctcgg 87840 

tgacccgatc gaggcccagg cgctcctggc cacctacggc cgtgaccgcg accccgagaa 87900 

ccccctgtgg ctgggctccc tgaagtccaa catcgggcac acccaggcag cggccggtgt 87960 

cgccggtgtc atcaagatgg tcatggcgat gcggcacggc gtgctgccgc agaccctgca 88020 

tgtcgacgcg ccgtcctcgc acgtcgattg gagcgtcggc gccgtcgaac tgctcaccga 88080 

gcagaccgcc tggccggaga ccggccgggt ccgtcgcgcc ggtgtctcct ccttcggcat 88140 

cagcggcacc aacgcccacg tcatcgtgga acagccggcg ctcgtcgaaa gcccggccgc 88200 

ggagccgagc ggacgcgaac ccggcgtcgt tccgctgccg ctgtccggaa agtcccccga 88260 

ggccctgcgc gaccaggccg cacgcctgct ggccgggttg gcggagcggc ccgcgctgcg 88320 

cccgctcgac ctcggctact cgctggcgac gacccgttcg gcgttcgacc accgggcggt 88380

ggtgctcgcc accgaccgcg ccgatgcggt ccgcgcgctg acggcgctcg ccgccgccga 88440 

cgcggatctc tccgccgtcg tcggcgacac ccgcacgggt cgtcacgcgg tgctgttctc 88500 

gggtcagggc tcgcaacgcc tgggcatggg gcgtgagttg tacgagcgtt tcccggtctt 88560 

cgccgaggct ctcgatgtcg cgatcgacca cctggacgcc gccttgcccg cccaggccag 88620
tctgcgtgag gtgatgtggg gcgacgatgt cgagctgctg gacgagacgg gttggacgca 88680 
gccggctctg ttcgccgtcg aggtggccct gttccggctg gtggagagtt ggggtgtccg 88740 
tccggacttc gtggccggtc attccatcgg tgagatcgcg gcggcgcatg tcgtcggggt 88800 
gttctcgctg gaggacgcct gccgtctggt ggccgcccgt gcgacgctga tgcaggcgct 88860 
gccgaccggc ggcgcgatga tcgcgatcca ggccgccgag gacgaagtca cccagcacct 88920 

gactgacgac gtctcgatcg ccgccgtcaa cggcccgacc tccgtggtcg tctccggggc 88980
cgagagcgct gcccgcacgg tggccgaccg gctcgcggag aacggccgca agacgacccg 89040 
gctgcgggtc tcgcatgcgt tccactcgcc gttgatggat ccgatgctgg cggagttccg 89100 
tgcggtggcc gagggcctgt cctacgccac cccgaccctc cccgtcgtct cgaacctgac 89160 
gggccggctg gccacggccg atgacctctg ctcggccgag tactgggcgc gccacgtccg 89220
cgaggcggtc cgcttcgccg acggcgtcag caccctggag aacgagggcg tcaccacgtt 89280 
cctggaactg ggaccggacg gcgtgctgtc cgccatggcc cagcagtcgc tcaccggcga 89340 
cgccgccacc gtcccggccc tccgcaagga ccgcgacgag gagacgtccg cgctcaccgc 89400 
cctcgcccac
cggcgccacc 
cggcaccctg 
gctgaacggt 
actgcagtca 
caccgcactg 
ggaactgacg 
ccgggtcggc 
gcacgcgacc 
cccggccgac 
caccgacgac 
gggcctgcgg 
gtccaccggc 
gcacgcctcc 
ggagggcgcg 
gggcaccgac 
cgccatcgac 
actggcccgc 
ggagaaccct 
cgccgcggcc 
cgccgccggc 
cgacgttccg 
gactgagtcg 
cccggccgag 
cgcacaggag 
cggcggtacc 
ccagtccgaa 
gcacgaccgc 
cgaaccgcag 
caccgcggcc 
gggcagcctc 
ccacgaggtg 
gttggggatg 
cgtcgaggtc 
caccggcagc 
ggactggtcc 
cctgaaggag 
tggtgtcggt 
ggccagtgag 
ctcctcccgc 
ggacgtcgtc 
cgacggcggc 
cgacggcctc 
catgctggcc 
cacctgggac 
cggcaagatc 
gggtggtacg 
gcggcatttg 
cgcggagttg 
ctcggctgtt 
gcatgcggct 
tgcggtgttg 
ggatctggac 
ggccaactat 
tgggttgcct 

ggggctgacg 
tgatcagggt 
ggtccgtctg 
cctggtcaag 
cgagcagctc 



ctccacacgg 
cgcgtggacc 
cccaccgcgc 
tcggtcgaac 
catccgtggc 
ctggaactgg 
ctcgccgcac 
gtcgccgacg 
gacgtgtcgt 
accggtttcg 
tgctacgcgc 
gccgcctggc 
gacgaggcga 
ctcgtcgccc 
accctctacg 
ggccgttcgg 
aacctcgtct 
gacgccctgt 
gtaccggaga 
accgtcgcgc 
atccacacca 
aagacggtcc 
acggacggaa 
gtcgcccaca 
cgcttcgccg 
gacgtcatgg 
gccccggaca 
acagccgccg 
ctcgccctgc 
gcgctcaccc 
aacggcctcg 
cgggtcgagg 
tatccgggtg 
ggaccggagg 
ttcggctcgc 
tgggagacgg 
ttgggtggtc 
atggcggcga 
ggcaagtggg 
accctcgact 
ctcaactccc 
cggttcctgg 
tcctaccagt 
gagctgatgg 
gtccggcacg 
gtgctcaccc 
ggtggtctgg 
ttgttgacca 
gagtcgttgg 
gcggagttgg 
ggtgtgttgg 
cgtccgaagg 
gcgtttgttg 
gcggcgggta 
ggtgtgtcgt 
gagcgggatg 
gtggcgttgt 
gacctggccg 
gcgcccatcc 
acccggctcc 



caggtctccg 
tgccgaccta 
acgccgcggc 
tcgccgaagg 
tggccgacca 
cgttccgggc 
cgctcgtcct 
acaccggccg 
ggacccagca 
acgccactgc 
gcttcacgac 
gcgccggtga 
ccgccttcgg 
acgagggcga 
cgaccggcgc 
tggccatcgc 
cgcgccgggt 
tcaccctgga 
acaccggcgg 
tggtcggcgc 
ccctccaccc 
tcatccccct 
tcgggacggg 
ccctgtccac 
gctcccgcct 
acgtggccgc 
ccttcgtcct 
ccgaacgggg 
gtgacggcgg 
cgccggccga 
ccctgacccc 
tgcgtgccgc 
atgatgtcgg 
tgaccggcct 
tcgccgtgga 
gtgcgtcggt 
tgcgggcggg 
tccagatcgc 
acgtgttgcg 
tcgaggcggc 
tcgccggtga 
agatgggcaa 
ccttcgacct 
acctcttccg 
ccaaggacgc 
tgccccgctc 
gtgggttggt 
gtcgttctgg 
gcgcgcgggt 
ttgccggtgt 
atgacggtgt 
tggatggtgc 
tcttctcgtc 
atgcgttttt 
tggcgtgggg 
tccgtcgtgc 
tcgatgcggc 
cactgcgcac 
gccgcgcagc 
agcgcgccga 



cgtcgactgg 
cgccttccag 
cgtcggcctc 
cgaaggggtg 
cgccgtcatg 
cggcgacgag 
gcccgagcgc 
ccgtaccgtc 
cgcgaccggc 
ctggccgccc 
gctcggcttc 
cgtgctgtac 
tctgcacccc 
ggagagcaac 
caccgcgctg 
cgtggccgac 
ctccggcgac 
ctggaacccc 
gggccacgcc 
ggacggcacc 
cgacctcacc 
caccggaacc 
ggccgccgag 
cgccgcactc 
ggcgttcgtc 
cgccgcggtc 
gatcgaccgt 
ccaactgctc 
cgtgctcgcc 
ccgggcctgg 
gtatccggcg 
gggcctgaac 
atcgttcggt 
ggcccccggc 
cgacgcgcgg 
gccgttggtg 
ggagaaggtg 
ccggcatgtc 
ctcgctcggc 
cttcgccgaa 
cttcgtcgac 
gaccgacatc 
cgcctgggtg 
caccggcgca 
gttccgcttc 
ctggaagccc 
ggcccggcat 
tgtgggtgct 
tgtggttgcg 
gtcggagtcg 
ggtgggttcg 
gtggaacctg 
tgttgcgggt 
ggacgcgttg 
tgcttgggat 
tgctgagtcg 
gttggcgacg 
ccggggcgac 
cgccaccaca 
gcgacgggac 



gcggcgttct 

cacgccacct 

accgccgccg 

ttgttcaccg 

ggacaggtcc 

gccggttgcg 

ggtgcggtac 

accgtccact 

accctgacca 

gccgacgccg 

gcctacgggc 

gccgaggtgg 

gcactgctcg 

ggcggactgc 

cgcgtccggc 

accgccggtc 

cagttgaccg 

gtaccggaga 

caggaccagg

gcgatcgccg 

accctcgcca 

ggaaccggaa 

tcggacgcgt 

gccctcgtcc 

acgaccgggg 

tggggcctgg 

gaccccggcc 

ctacgggcac 

gcccgcctgg 

cggctcgaca 

gcactggcgc 

ttccgtgacg 

tcggaggcgg 

gaccaggtca 

cgcctcgccc 

ttcctcaccg 

ctggtgcatg 

ggtgccgagg 

gtggccgacg 

gtcgccggcg 

gcctcgatgc 

cgcgccgcgg 

gtgccggaaa 

ctgcggccac 

atgagcatgg 

gagggaacgg 

ctggtgcggt 

gcgggtgcgg 

gcgtgtgatg 

tatccgttgt 

ttgacgccgg 

catgaggcga 

gtgttcgggg 

atggttcatc 

cagggtgtgg 

ggtatgccgt 

gggagtgccg 

atcgcaccgc 

cccggcgaca 

accctcctcg 



tcgccc 



tdgccggcag 
actggcccac 
agcacccgct 
gacggctgtc 
tgctgcccgg 
accgcgtcga 
agacccaggt 
cccggcccga 
tgggctccgc 
aacccctcgc 
cggtcttcca 
ccctggcgga 
acgccgccct 
cgttctcctg 
tgaccccgac 
gtccggtcgc 
gcgccgcggg 
acctcgtacc 
acggccggcc 
ccgacctgac 
cgaccgacgc 
ccggcaccgg 
ccgccccctc 
aggagtggac 
cgacggccgc 
tccgatccgc 
cggccggcac 
tgcacaccga 
cccgcttcga 
gcacggccaa 
cgctcaccgg 
tgctcaacgc 
ccggtgtggt 
tgggcatgat 
gcctgcccga 
cgtactacgc 
ccggtgccgg 
tgttcgccac 
accacatcgc 
accgcggcct 
ggttgctcgg 
actccgttcc 
ccatcggcac 
tcccggtccg 
ccaagcacat 
tgttggtgac 
cgtgtggggt 
ccgggttggt 
tgggtgatgg 
ctgcggtggt 
agcggttggc 
cgcgtggtct 
gtgcgggtca 
gggtggctgg 
ggatgacggc 
tgttgacggt 
cgttggtgcc 
tcctccgcgg 
ccggactcgc 
cgctcgtccg 
89460

89520 

89580 

89640 

89700 

89760 

89820 

89880 

89940 

90000 

90060 

90120 

90180 

90240 

90300 

90360 

90420 

90480 

90540 

90600 

90660 

90720 

90780 

90840 

90900 

90960 

91020

91080 

91140 

91200 

91260 

91320 

91380 

91440 

91500 

91560 

91620 

91680 

91740 

91800 

91860 

91920 

91980 

92040 

92100 

92160 

92220 

92280 

92340 

92400 

92460 

92520 

92580 

92640 

92700 

92760 

92820 

92880 

92940 

93000 
cgaccaggcc gcgatggtcc tcggccacac ctcgggcgac ggcgtcgacc cgrcccgcgc 93060

cttccgcgac ctcggcttcg actcgctcac cgcggtcgaa ctccgcaacc gcatcggcgc 93120 

ggccaccggc ctgcggctac cggccacggc cgtcttcgac taccccaccg ccgatgccct 93180 

cgccgcacac ctgctcaccg aactgctcgg ccccgacgcc gagtcggacc ccgacgagcc 93240 

cggcgacccc accgcgggac cgaccgacga ccccatcgtc atcatcggca tgagctgccg 93300 

cttccccggc gacatcggct cgccggagga cctgtggcgc ctgctcggcg acggcgccga 93360 

cgtcgtcacc gacttcccga ccaaccgcgg ctgggacctg gacaacctct acgaccccga 93420 

ccccgcgcac gccggcacct cgtacgcccg caccggcggt ttcctgcacg acgccgccga 93480 

cttcgacgcc gacttcttcg gcatgagccc ccgcgaggcc atggccacgg actcccagca 93540

gcgcctgctg ttggagtcct cgtgggaggc gatcgagcgg gccggcatcg acccgctgac 93600 

cctgcgcgac agccgcaccg gcgtcttcgc cggcgtcatg tacagcggct acggcacccg 93660 

cctcgacggc gccgaattcg aaggcttcca ggggcagggc agcgcactga gcgtggcctc 93720 

cggccgggtc tcctacacct tcggcttcga aggcccggcc atgacggtcg acaccgcctg 93780 

ctcctcctcg ctggtcgccc tgcacctcgc cgcacaggca ctccgcggcg gtgagtgcac 93840 

cctcgccctc gccggtggtg tcaccgtgat gtccatcccg gacaccttca tcgagttctc 93900 

ccggcagcgc ggactggccc ccgacggccg ctccaagccg ttctccgagt ccgccgacgg 93960 

cgtcggctgg tccgagggcg tcggaatgct gctcctggag cgccagtcgg acgccgtgcg 94020

caacggccac cagatcctgg ccgtggtgcg cggctcggcg gtcaaccagg acggtgcgtc 94080

caacggcctg accgcgccca acggcccgtc ccagcagcgg gtgatccgtc aggcgttggc 94140 

cagcggcggc ctgtccacgg ccgacgtgga cgccgtcgag gcgcacggca cgggcaccac 94200 

gctcggtgac ccgatcgagg cccaggccct cctggccacc tacggccgcg accgcgaccc 94260 

cgagaacccg ctgctgctcg gttcgatcaa gtccaacctc ggccacaccc aggcagccgc 94320 

cggtgtcgcc ggcgtcatca agatggtcat ggcgatgcgg cacggcgtgc tgccccgcag 94380 

cctgaacatc accgagccgt cctcgcacgt cgattggagc gccggcgccg tcgaactgct 94440 

caccgagcag accgcctggc cggagaccgg ccgggcccgt cgcgccggta tctcctcctt 94500 

cggcatcagc ggcaccaacg cccacgtcat cctggagcag ccggaggccg cgcggcactc 94560 

ggcgccggaa gaagccgaca cggcggaggc agccgccaag gcgccggcca ccgcgcacct 94620

gcccgtaatg ccgtgggcac tgtccggcaa gacgccggag gccctgcgtg cccaggccgc 94680 

acgcctcctc gcccacctcc agcagcgccc cgaactcgca cccgccgaca tcgccctgtc 94740 

cctcgccacc cagcgctccc agttcaccca ccgggcagtc gtcctgagca ccgaccgtga 94800 

cgaggcgacc cgcgcgctgt ccgccctcgc caccaccgcc gcgtccgacc cctcggccct 94860

caccggcacg gtcaccatgg gacgttgcgc ggtgctgttc tcgggtcagg gctcgcaacg 94920

tctgggcatg gggcgtgagt tgtacgagcg tttcccggtc ttcgccgagg ctctcgatgt 94980 

cgtgatcgat cacctggacg ccgccttgcc cgcccaggcc ggtttgcgtg aggtgatgtg 95040 

gggcgacgat gtcgagttgc tgaacgagac gggttggacc cagcccgcgc tcttcgccat 95100 

cgaggtggcg ctgtttcggc tggtggagag ttggggtgtc cgtccggact tcgtggccgg 95160 

tcattccatc ggtgagatcg cggcggcgca tgtcgtcggg gtgttctcgt tggaggacgc 95220 

gtgccgtctg gtggccgcgc gggcgacgct gatgcaggcg ttgccggccg gtggcgcgat 95280 

gatcgcggtc caggcgaccg aggacgaagt catcccgcac ctgaccgacg aggtggcgat 95340 

cgcggccgtc aacggcccga cctccgtggt gatctcgggc gcagaagagg ccacgcagac 95400 

cgtggcacaa cacttcgccg accaggggcg ccggacgacc gcgctgcggg tctcgcatgc 95460 

gttccactcg ccgctgatga tgctggcgga gttccgtgcg gtggccgagg gcctgtccta 95520 

cgccaccccg accctccccg tcgtctcgaa tctgacgggc caggtggcca cggccgacga 95580 

actctgctcg gccgagtact gggtgcgcca cgtccgtgag gcggtccgct tcgccgacgg 95640 

tgtgacggcc ctcgaagccg agggcgtgcg gaccttcctg gaactcggcc cggacggcgt 95700 

cctcgccgcc atggccaggg aaaccgtcgc cgacgacacg gtcaccgtcc ccgtcctccg 95760 

caggaacatg cccgaggaac ggaccctgct caccgcactc ggccggctcc acaccaccgg 95820 

aaccccgatc gactgggccg ccctcctggc cccgaccggc gcccgcccgg tggacctgcc 95880 

gacatacgcg ttccaacacc gtcccttctg gccctccggc ccccgcgaca ccgcggatgc 95940 

cgccgccgtc ggcatcgccg gcgcgagcca cccgctcctc aacggcatcg tcgaactcgc 96000 

cgacgaagag ggcctgttgt tcaccggacg gctgtcactg cagtcgcatc cgtggctggc 96060 

cgaccacgcc gtcatgggac aggtcctgct gcccggcacc gcactgctgg aactcgccct 96120

gcgcgccggc gacgaggtcg gctgtgacca cgtcgaggaa ctgacgctcg ccgcaccgct 96180 

cgtcctgccc gagcgcggcg cggtacagac ccaggtccgg gtcggcgtcg ccgacaccac 96240 

cgggcgccgc accgtcacga tccactcgcg tcccgcacgc gccacgacca ccgacagtga 96300

cacccacacc ggcaccgaca ccccgtggac ccaacacgcc accggcgtcc tcgtcgccgg 96360 

cctgccggcg acggcaaccg tcccgttcga tgccaccgtg tggccgccgg cgcacgccga 96420 

acccgttgac ctggcggact tctacgcgtc ccgggccggc gaaggattcg gctacgggcc 96480

cgctttccag ggcctgcgag cggcctggcg ccgcgacggc gaggtgttcg ccgatgtcgc 96540 

actgccggag gccggccgta ccgaagccga ggcgtacggg ctgcatccgg cactgctcga 96600 
cgccggactg

cgtgccgttc 

ccgactcggc 

accggtcgcg 

caccgcgggc 

ggcgtgccag 

aacctccgaa 

cgtccgcacg 

cgcctccgac 

cacccgccga 

gaagttcgtc 

gtggggtctg 

ggatccggat 

gcagcttgcg 

cgaggtcggt 

gttctcgggt 

ggcgcgtcat 

tgaacgtgcc 

gcgggtggtt 

tgcggtgtcc 

gaccggggag 

tgaggcgacc 

cttcggcagt

gacgcggcgc 

gaccgatggc 

agtgccaccg 

tgacgccacc 

gccgcccttg 

ggcaaccgcc 

agtcctcctg 

cgacgttcat 

actgcgcaac 

ctatccgacc 

cgaggtggcg 

gggcatggcc 

cacggacggc 

cctctatcac 

gcatgaggcg 

gaccgattcc 

tattgatccg 

cgattacagc 

gccgagtttg 

ggtggatacg 

tagtggtgag 

gtttgtggac 

ggatgcggcc 

gtcggacgcg 

ccaggatggt 

ccggcaggcg 

tggtacgggt 

gcgggatcgt 

tacgcaggct 

tgtggtgccg 

tgcggtggag 

gggtgtctcc 

ggccgcgcgg 

gctttccggc 

cgaggaacgc 

gacctttgaa 

gtcggcgatt 



cacgcagcct 

tcgtggcgcg 

cgcgactccg 

tccgtacagg 

ctcgcccgcg 

ccggacgaca 

ctcacctccg 

accctgtcga 

cagaccggca 

gccctggccc 

ttcgtgaccc 

gtgcgttcgg 

ggtgcggtgg 

gtgcgcggtg 

gcgggtgctg 

gagggtgcgg 

ctggtggccg 

gtgggtgctg 

gcgtgtgatg 

gcggtggttc 

cggttgtccg 

cgcggcctgg 

cccggccagg 

cgggcggagg 

acatcgggca 

ctgaccgcgg 

tgcgtcccgg 

ctgcggtccc 

accgggctgc 

gacctcgtgc 

ccggcacgtg 

cgcctcaaca 

gtcgaagtgc 

accgtgcagc 

tgccgctacc 

gtcgacgcgg 

ccggaccccg 

ggggagttcg 

cagcagcggt 

gtgagtttgc 

gcgatgttgg 

gcgtcgggtc 

gcgtgttcgt 

tgtgggttgg 

tttgctcggc 

gatggtgtgg 

gtgcgcaatg 

gcgtccaatg 

ttggccagtg 

acgacgctcg 

gagcctgagc 

gctgcgggtg 

cggacgttgc 

ctgctcagtg 

tccttcggca 

cgtccggtga 

aagacaccgg 

ccggagcttc 

caccgtgcgg 

gctgccgacg 




ggctcgtcgc 

gcgtcttcct 

acggaacgct 

ccctctccat 

acgcgctgtt 

cggtgaccgt 

agctcaccgc 

ccgatgaacc 

ccgcagaagc 

tcgtacagac 

gtggtgcgac 

cgcagtcgga 

gtgcggctgc 

atgtgttgcg 

atggcaccgg 

tcctggtcac 

agtatggggt 

gggagttggt 

tgaccgatcg 

atgcggctgg 

cggtgctgcg 

atctggacgc 

ccaactacgc 

gactgcccgg 

tgctcgcgga 

agcaaggact 

tccgcctgga 

tgatccgagg 

gggaacggct 

gcggccaggt 

ccttcaggga 

ccgtcaccgg 

tggtctccta 

cggccgcggt 

ccggcggggt 

tgtccccctt 

accatctcgg 

atccggggtt 

tgttgttgga 

ggggtagtcg 

cgagtccgga 

gggtggccta 

cgtcgttggt 

cgttggccgg 

agcggggttt 

gctggtccga 

gtcacgagat 

gtttgacggc 

gtggtctgac 

gtgatccgat 

ggccgttgtt 

tggcgggcgt 

atgtggatgc 

agcaggcggc 

tcagcggcac 

tggagacgaa 

aggccctgcg 

gcctcgtcga 

tggtactcgc 

aggccgatgc 



cccggacggg 

ggccgcttcc 

gagcctggcc 

gcgcaccgtc 

ccgcctggac 

gatcccggcg 

ggccctgcgt 

cgcgcccgcg 

ggcacccgtc 

ccgcctgcaa 

ggtcgggcgt 

gaatccgggt 

gctcgtcgct 

ggtcgcgcgt 

ggatggcgtc 

tggtggtacg 

gcgggatctg 

ggcggagctt 

tgccgcggtg 

tgtgctggat 

gccgaaggtg 

gttcgtcgtc 

ggccgcgaac 

cctgtcactc 

cgccgaggcc 

ggcactgttc 

cctgtcggca 

ccgctcgcgc 

cgtcggactg 

ggccctggtc 

gttgggcttc 

tctgcggctc 

cgtcctggac 

tgcggtggcg 

cgcctctccg 

cccgaccaac 

tacctcctac 

cttcgggatg 

gtcgtcgtgg 

gacgggtgtg 

gttcgagggt 

cacgttgggg 

ggcgatgcac 

tggtgtgacg 

gtcgccggat 

gggcgtcggc 

tttggctgtg 

gcccaatggt 

ggccggtgac 

cgaggcgcag 

gttgggttcg 

tatcaagatg 

gccttcttcg 

ttggccggag 

caacgcgcac 

caccgtggag 

cgcccaggcg 

cgtgggaatg 

cgccgaccgt 

tgccgctgcc 



gagcccacac 

ggtgcctcct 

atcgccgaca 

tcggtgacgg 

tgggcctcgg 

gtcgcggtcg 

gcggcgggcg 

ctgatcgcgc 

ccggcggccg 

gagcagcact 

gatgtggctg 

tgttttgctc 

gcgttggtca 

ctggtgcggc 

gggggtggct 

ggtggtctgg 

ctgttggtca 

gcgggtgtgg 

gtggagttgg 

gacggcatgg 

gatgctgtct 

ttctcctctc 

gccttcctgg 

gcatggggac 

gatcgcctga 

gacgcggccc 

ctgcgtgccc 

cgcgccgccg 

aacccggtcg 

ctcggccacg 

gactccctca 

cccgcgacca 

gagttgctgg 

gacgatccga 

gacgacctgt 

cgtggttggg 

acgcgctcgg 

agtccgcggg 

gaggcgatcg 

ttcgcggggg 

ttccagggca 

ttggaaggcc 

tgggcgatgc 

gtgatgtcga 

ggtcggtgca 

gtgttggtcc 

gtgcggggtt 

ccgtcgcagc 

gtggatgtgg 

gcgttgttgg 

gtgaagtcga 

gtgttggcga 

catgtggact 

acgggtcggg 

gtgattctgg 

ccgtccacgg 

gcgaaactgt 

tccctggtca 

gccgacgccg 

accggccgcg 
ggacgggcag
cggtccgcgt 
ccaccggtgc 
ccctgagcgc 
ccccggagcc 
tcggtacgga 
ccgacgtcga 
tgccgctcgt 
tgcacgacct 
tcgcggacac 
ctgctgcggt 
tggtcgatct 
gtggtgagcc 
ggccgctcac 
ctggtgtgtc 
gtgcggtgtt 
gtcgcagtgg 
gtgcgcgggt 
ttggcgggca 
tgggtgcgtt 
ggcatctgca 
tcgccggggt 
acgcgctgat 
cgtggtcgct 
cccgttcggg 
tggcgaccgg 
agggtgaggt 
ccgcggaatc 
agcgacagga 
ccgacgccga 
cctcggtgga 
tggtgttcga 
gcacggatgc 
tcgtcatcgt 
ggaggctcgt 
acgtcgaatc 
gtgggttcct 
aggcgttggc 
agcgggccgg 
tgatgtacag 
gtgggagttc 
cggcggtgac 
aggcgttgcg 
cgcctgcggt 
aggcgtttgc 
tggagcggca 
cggcggtcaa 
agcgggtgat 
tggaggcgca 
cgacgtatgg 
atctggggca 
tgcggcatgg 
ggtccgaggg 
tgcggcgggc 
agcgtccgga 
tgccgtgggt 
tgtcgtcgat 
ccggccgctc 
cccgcgcatt 
tcggcgcggg 



96660 
96720 
96780 
96840 
96900 
96960 
97020 
97080 
97140 
97200 
97260 
97320 
97380 
97440 
97500 
97560 
97620 
97680 
97740 
97800 
97860 
97920 
97980 
98040 
98100 
98160 
98220 
98280 
98340 
98400 
98460 
98520 
98580 
98640 
98700 
98760 
98820 
98880 
98940 
99000 
99060 
99120 
99180 
99240 
99300 
99360 
99420 
99480 
99540 
99600 
99660 
99720
99780 
99840 
99900 
99960 
100020 
100080 
100140 
100200 
tcgtcacgcg
gtacgagcgt 
cgccttgccc 
gaacgagacg 
ggtggagagt 
ggcggcgcat 
tgcgacgctg 
ggacgaagtc 
cgcactggtc 
cgaggggcgt 
tccgatgctg 
ccccgtcgtc 
gtactgggtg 
agccgagggc 
aggagcctcg 
ggaaccggcg 
gcccgtgctc 
gcgtgggtgg 
gcttgggtct 
ggtggtgttg 
gatggggcgg 
tgaggtgggg 
gcgtggtggg 
tgtgggggtg 
cggtgtgttg 
gccgccgcag 
tgggttcggt 
actcttcgca 
tttcgggttg 
tgaaggtcaa 
cttcgcctcg 
gtcggtgacc 
ccgtcgcctc 
cggcgtccaa 
catcggaacc 
cacgaccacg 
agtgtccgtc 
cgtgaacgac 
cgtagacgcg 
cgacgagcgt 
ggatgtcgcc 
caccttcggc 
tctcacctcc 
gacccgcctg 
ggtgctgatc 
caggtacggc 
cacggagttg 
cgtggccgat 
gagggcggtc 
ggagcggctc 
gacccgcggc 
cggcgccggc 
gcgccgggca 
cggaatgacg 
gccgctctcc 
cagcacgcca 
gatcgccatc 
cgcgcagggt 
ggcggccggc 
cgacgagcgg 



gtgctgttctT^gggtcaggg 
ttcccggtct tcgccgaggc 
gcccaggccg gtttgcgtga 
ggttggacgc agccggctct 
tggggtgtcc gtccggactt 
gtcgccgggg tgttctcgct 
atgcaggcgt tgccggccgg 
accccgcatc tgaccgacga 
gtatcgggtg tggaggatgc 
cgcacgaccc gactccatgt 
gcggagttcc gtgtggtggc 



tcgaatctga 
cgccacgtcc 
gtgcggacct 
ctcaccgaat 
gcactcgccg 
ttcgctggtg 
ttctggccgg 
gcggggcatc 
acgggtcgtc 
gtgtttgttc 
tgtggtcgtg 
gtgcgggttc 
tattcgtgtc 
gcctctggtg 
ggtgcggtgt 
tatggcccgg 
gaggtcgccc 
cacccggcgt 



cgggccaggt 
gcgaggcggt 
tcctggaact 
cctccctcgc 
ccctggccca 
tgggtgcggg 
ttggtcgggt 
cgttgttggg 
tgtcgttgtc 
ctggtacggc 
ttgaggagct 
aggttgctgt 
ctgatggtgt 
tggctgacca 
cggtggatgc 
tgttccaggg 
tgtcggacga 
tgctggatgc 



tccgccgatg gtgggcctgc 
ggtgcgacgg ctttgcgcgt 
gcggtggatc cgaccggtgc 
accctcgatg aggtcaacgc 
tggaccacgg tcccgagcac 
gaccacctgg ggctcgccga 
accgcggccg cgtacgagag 
cctgacgtca cactcatcgg 
cacgacgcca cggtggccgg 
gctcgccgcc tcaccgccga 
ctggcggcaa ggcgcctggt 
gcggcggccg tccagggcct 
ctcctcgacc tcgacgggac 
gacgaaccgc aactgcttct 
gcgtcgcccg ccgacacggc 
accggtggta ccggcgggct 
gtccgcaatc tcctgctcgt 
gtcgccgagc tgacggcgca 
ggcgatgcgg tggcggcgtt 
gtccacacgg ccggtgtgtt 
gccaccgtcc tgcggcccaa 
ctggacctgg acgcgttcgt 
caggccaact acgccgcggc 
gcgggcctgc ccggactctc 
ggcatgctgt cggacgccga 
gcggagcagg gcctcgccct 
gacagggcag ccggcagcgc 
ccggccgcgg ccctcgtcgc 
gaggtccccg cgattctgcg 
ggttccgtca ccgtggccgg 
cgccaggaac ttctcgaact 



tgctcaacgt 
tctcgatgtc 
ggtgatgtgg 
gttcgccatc 
cgtggccggt 
ggaggacgcc 
cggtgcgatg 
cgtagcgatc 
cgccgtcgag 
gtcgcatgcg 
ggagggcctg 
ggccacggcc 
ccgcttcgcc 
cggcccggac 
ggtaccgctg 
gttgcacatc 
gcgggtggag 
tggtgttggt 
tgctgcggtg 
gtcgcatggt 
gttgctggag 
gacgttggcg 
ggatgctcct 
gggtcaggcg 
ggtcggtggg 
tgagggctgc 
gttgcgtgcg 
ggttgctgag 
ctcgctccat 
gttgccgttc 
gcggttggcg 
gccggtgatt 
atctcacacc 
cccggccgcc 
agcgctcagc 
ccttgacgcg 
tctcaccacc 
ccaaggcacc 
ggccctgcgc 
cttcgtgacc 
ggtgcgctcc 
cgaggcgtcg 
gcgcgacgga 
cgtgcccacg 
cggcgcgcag 
cagccggcgc 
cggtgccgag 
ggtcgccggc 
ggacgacgga 
ggcggatgcc 
cgtcttctcc 
caacgccttc 
gttggcctgg 
ggccgaccgc 
cttcgacgcg 
cgccgccagc 
accggtccgg 
cggactggtg 
actcgtcaac 
ggtccgcact 



ctgggcatgg 
gtggtcgacc 
ggcgacgatg 
gaggtggcgc 
cattccatcg 
tgccgcctgg 
atcgcggtcc 
gccgccatca 
atcggggcgc 
ttccactcgc 
tcctatgctg 
gacgaactgt 
gacggggtga 
ggcgtcctcg 
ctccgtaagg 
gccggcgcgc 
ttgccgacgt 
ggtgatgtgg 
gagttggctg 
tggttggctg 
atggtgatgc 
gcgccgttgg 
gatgctgcgg 
gtgtggtcgc 
ttcggtgacg 
tacgagctgt 
gtgtggcgtc 
agcgctgata 
gcctcgcttc 
gcgtgggagg 
ccggcgggtg 
tccatcgacg 
cagctgagcg 
gaccacccgt 
agttcctctg 
ctgatcgcgg 
gaggacgcca 
atcggagccg 
acgatccagg 
cgcggcgcgg 
gcccagaccg 
accgcggtcc 
cacctgcacg 
gagtggaacg 
ttcgcacggc 
ggccccgatg 
gtggccgtgc 
gtgccggatg 
gtgatcggct 
gcctggcatc 
tccgtcgcag 
ctggatgcgc 
gggccgtggg 
ctcgcccgct 
gcactcgctc 
acgtcgggaa 
ctcgacctgg 
cggacccgca 
cgcctgtccg 
caggcagccc 



ggcgt( 



gg^tgagtt 
acctggacgc 
cggagttgct 
tgttccggct 
gtgagatcgc 
tggccgcccg 
aggcgaccga 
acgggccgaa 
ggttcgcggc 
cgttgatgga 
ctccgtccct 
gctcggccga 
cggccctcga 
ccgccatggc 
accggccgga 
gcgtcgattg 
atgcgttcca 
gtgctgtggg 
cgggtgcggg 
atcatgcggt 
gtgctggtga 
tgttgcctga 
gtcgtcgtgg 
agcatgctgt 
gtggtgtgtg 
ttgcggatgc 
gcggcgagga 
cggcgaccgg 
tctcctccct 
gtgtttccct 
agcatgcggt 
cgcttcgtac 
atgcgctctt 
cggtagccat 
ctggcgcgac 
cggggcccga 
tcgctcagta 
gcgcggcggc 
catggttggc 
ccgacgggca 
agaacccggg 
tcggcgaggc 
ccgccaggct 
cggacggcac 
acctcgtcga 
ccccgggaac 
aggcatgtga 
agcacccgct 
cgctcaccga 
tgcacgaagc 
gggtcttcgg 
tgatggccca 
accagaccgg 
ccggcatccc 
ttgccggaac 
ccggcgacac 
cggcgctggc 
cccgccgtac 
ggctcaccgc 
tcgtcctcgg 
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gcacgccgat 
gctgaccgcc 
aaccctggtc 
gttcggcgcg 
cgatcccatc 
ggacctgtgg 
cggctgggac 
ccgctccggt 
cccgcgtgag 
ggcgatcgag 
cgccggcgtg 
ccaaggccag 
cgaaggtccc 
ggcagcccag 
gatgtccacg 
tcgttccaag 
cctcgtcctg 
ccgcggctcg 
gtcccagcag 
ggacgccgtc 
gctcctggcc 
caagtccaac 
catggcgatg 
cgtcgattgg 
cggccgggtc 
catcctggaa 
gccggtcgcc 
gtcacccgag 
cgaaccgcgc 
ccgcgcgatc 
ggccctggcc 
ttcggcagtg 
cggccgtttc 
cttgcccgcc 
cgagacgggt 
ggagcgttgg 
ggcgcatgtc 
gacgctgatg 
cgaagtgacc 
cgtggtgatc 
ggggcgccgg 
gatgctggcg 
cgtcgtctcg 
ctgggtgcgc 
cgagggcgtg 
gcagtccctc 
gtccacggcc 
ggacttcttc 
tgggtggttc 
tgggtctgcg 
ggtgttgacg 
ggggcgggtg 
ggtggggtgt 
tggtggggtg 
gggggtgtat 
tgtgttggcc 
gccgcagggt 
gttcggttat 
cttcgcagag 
cgggttgcac 



ccggcgtccg^ggactccac 
gtcgagctgc gcaaccggct 
ttcgactacc cgaacaccga 
gtcgagagcg aggtgcgggt 
gtggtggtgg gcatggcctg 
cgcctggtcg acgccggcac 
ctcgaatcgc tctacgaccc 
ggcttcctgc acgaggcggg 
gcgctggcaa ccgactccca 
cgggccggca tcgacccgct 
atgtacagcg actacgggag 
ggaagtgcgg gcagcgtggc 
gccgtcaccg tggataccgc 
gcccttcggg cgggtgagtg 
ccaggcacgt tcgtggagtt 
gcgttcgccg aggccgcgga 
gagcgccagt cggacgccgt 
gcggtcaacc aggacggtgc 
cgcgtcatcc gtcaggcgtt 
gaggcgcacg gcacgggtac 
acctacggcc gcgaccgcga 
ctcggccaca cccaggcagc 
cggcacggcg tgctgccgca 



agcgtcggcg 
cgtcgcgccg 
cagccggagg 
atcaagccgt 
gccctgcgcg 
tcgatcgaca 
gtgctggtcg 
tccggtgtgg 
ctgttcacag 
ccggtcttcg 
caggccggtt 
tggacccagc 
ggtgtccgtc 
gccggggtgt 
caggcgctgc 
ccgcacctga 
tcgggcgcag 
acgaccgcgc 
gagttccgtg 



ccgtcgaact 
gtgtctcctc 
ccgtgcagcg 
cggcggaacc 
cccaggccgc 
tcggccactc 
acgatgcgaa 
ccgatcccgc 
gtcagggtgc 
ccgaggctct 
tgcgtgaggt 
ccgcgctctt 
cggacttcgt 
tctcgctgga 
cgaccggcgg 
ccgacgaggt 
aagaggccac 
tgcgggtctc 
cggtggcgga 



aatctgacgg gctggctggc 
cacgtccgcg aggcggtccg 
cggaccttcc tggaactcgg 
gccggcgaag ccgtcaccgt 
ctgacggccc gagcgcatct 
gctggtgtgg gtgcggggcg 
tggccggttg gtcgggttgg 
gggcatccgt tgttgggtgc 
ggtcgtctgt cgttgtcgtc 
tttgctcctg gtacggcgtt 
ggtcgtgttg aggaactgac 
cgggttcagg ttgctgtgga 
tcgtgtcctg atggtgtggg 
tctggtgcgg ctgaccaggt 
gcggtgtcgg tggatgctga 
ggcccggtgt tccaggggtt 
gtcgccctgt cggacgaggt 
ccggcgttgc tggatgcctc 



cgcacagttc 

gagcacggcc 

tgccctcgcg 

gccggtccag 

ccgtttcccc 

cgacgccatc 

ggaccccgca 

ggagttcgac 

acagcgtctc 

gaccctgcgc 

catcctcggc 

ctcgggccgc 

ctgctcctcc 

cacgctcgcg 

ctcgcggcag 

cggcgtcggc 

gcgcaacggc 

gtccaacggc 

ggccagtggc 

gacgctcggt 

ccccgagaac 

ggccggtgtc 

gaccctgcat 

gctcaccgag 

cttcggcatc 

cctggcaccg 

gtccctggtg 

acgcctccgt 

actggccgtc 

ggccccggcg 

cgtcgtctcc 

tcaacgtctg 

cgatgtcgtg 

gatgtggggc 

cgccgtcgag 

ggccggtcat 

ggacgcctgc 

cgcgatgatc 

ggcgatcgcg 

gcagaccgtg 

gcatgcgttc 

aggactgtcc 

cacggccgac 

cttcgccgac 

cccggacggc 

gcccgtcctg 

ccacacccgc 

ggtggagttg 

tgttggtggt 

tgcggtggag 

gcatggttgg 

gctggagatg 

gttggcggcg 

tgctcctgat 

tcaggcggtg 

cggtgggttc 

gggctgctac 

gcgtgcggtg 

tgctgagagc 

gctccatgcc 



cgtgacctcg 

accggcctgc 

gagcacctgc 

gcactgccgc 

ggtggtgtga 

accaccttcc 

cacctcggca 

ccggcgttct 

ctgctggaat 

ggcagcgcca 

ggcaaggagt 

gtctcctaca 

tccctggtcg 

ctcgccggtg 

cggggtctgg 

tggtccgagg 

cacgagatcc 

ctgaccgcgc 

ggcctgtcca 

gacccgatcg 

ccgctgctgc 

gccggtgtca 

gtcgacgcgc 

cagaccgtct 

agcggcacca 

ggagcagcag 

ccgtgggcgc 

gacttcctgg 

acacgctcgc 

gacagcctgg 

gacgcggtat 

ggcatggggc 

gtcgaccacc 

gacgatgtcg 

gtggcgctgt 

tccatcggtg 

cgtctggtgg 

gcggtccagg 

gccgtcaacg 

gcacaacact 

cactcgccgc 

tacgccaccc 

gaactgtgct 

ggcatcacca 

atcctgtccg 

cgcaaggacc 

ggactgatcg 

ccgacgtatg 

gatgtgggtg 

ttggctgcgg 

ttggctgatc 

gtgatgcgtg 

ccgttggtgt 

gctgcgggtc 

tggtcgcagc 

ggtgacggtg 

gagctgtttg 

tggcgtcgcg 

gctgatacgg 

tcgcttctct 



gCTtcgactc 
gcctgaccgc 
gggacgagtt 
cgaccgccga 
cctcgcccga 
cgaccaaccg 
cctcctacac 
tcggaatgag 
cctcctggga 
ccggcgtctt 
tcgagggctt 
ccctcggctt 
ccctgcacct 
gtgtgacggt 
cgcctgatgg 
gcgtcggcat 
tcgccgtgat 
ccaacggccc 
cggccgacgt 
aggcccaggc 
tcggttcgat 
tcaagatggt 
cgtcctcgca 
ggccggagac 
acgcccacgt 
agaccgtgga 
tgtccggcaa 
cggaacggcc 
agttcgacca 
ccgccctcgc 
cgaccggcgg 
gtgagttgta 
tggacgccgc 
agttgctgaa 
tccggctggt 
agatcgcggc 
ccgcccgtgc 
ccaccgagga 
gcccgacctc 
tcgccgacca 
tgatggatcc 
cgtccctccc 
cggccgagta 
ccctcgaagc 
cgctggctca 
gcggtgagga 
aagactggca 
cgttccagcg 
ctgtggggct 
gtgcgggggt 
atgcggtgat 
ctggtgatga 
tgcctgagcg 
gtcgtggtgt 
atgctgtcgg 
gtgtgtggcc 
cggatgctgg 
gcgaggaact 
cgaccggttt 
cctcccttga 
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aggtcaatcc gccgatggtg^gcctgcgtt gccgttcgcg tgggagggtg 
cgcctcgggt gcgacggctt tgcgcgtgcg gttggcgccg gcgggtgagc 
ggtgaccgcg gtggatccga ccggtgcgcc ggtgatttcc atcgacgcgc 
tcgcctcacc ctcgatgagg tcaacgcatc ccacacccag ctgagcgatg 
cgtccaatgg accacggtcc cgagcacccc ggccgccgac cacccgtcgg 
cggaaccgac cccttcggcc tcgcagacgg cctttcggac gccttgcccc 
gcgcggtgac ctcgcggcgc tcgcagcgtc ggagcacccg gtaccggacc 
cccggtagcg ggcacccggc gcacaggcgt acctgcggac gccgaaggac 
cgggacatcc gacatgctcc gatccgtgcg tgaggccacc gcacaggtac 
ccagcagtgg ttggcggacg accggttcga ggcggcgcgg ctggtgttcg 
ggcggtttcc gtgggtgagg gcggcatcgc cgacctggcg gcctcggccg 
ggtgcggtcg gcgcagtcgg agaatccggg ctgcttcggt cttctcgacc 
cctcgccctt gactccgacc ttgcccccga ggtcgacatc gagcgcgacc 
tccggtcggt gggaccgtgc agcccgcgct cgccgcggcc ctgcacgcga 
gccgcagttg gcactgcgcg gcgggaccgt gcaggccgcc cgactgaccc 
gccgcagacc gaccgtgccg agaccgaccc tgccgagacc gaccgtccgg 
ccggcggccc ggcacggtgc tcatcaccgg tggtaccggt ggcctcggtg 
ccggcacctc gtcgccgagc ggggggtacg gagcctggtg ctcgccagcc 
cgcggccgag ggagcggaga agctggtcgc cgacctcgaa gcgctcggtg 
cgtgcagacg tgtgatgtgg ccgatggcga tgcggtggcg gcgttggtcg 
ggacgagtac ccgctgacgg cggtcgtcca cacggccggt gtgttggacg 
cggctcgctc accgaggagc ggctcgccac cgtcctgcgg cccaaggcgg 
gcatctgcac gaggcgaccc gcgatctgga cctggacgcg ttcgtcgtct 
cgccggcgtc ctcggtggcg ccggtcaggc caactacgcg gcggcgaaca 
cgccttgatg gcgcagcgtc gcgccgccgg gttgccgggt gtgtcgctgg 
gtgggaccgg gccggcggca tgacggggac cctgtcggac gccgaggccg 
ccgctccggt gttccgccga tctcggcgga gcagggcctt gcgctgtacg 
cgccggtgag cggccgctgg tggtgccggt gcggctggac ctcgccgcgc 
cggtgatgtc ccggcgctgc tgcgcggact ggtccggacg cccgcgcggc 
ggccggtgcg gcgccgtcgg ccgatgtgct cacccggcag ttggccgggc 
ggagcaggag gaggtcctgc tgaggctggt gcgcggtcag gccgcggtgg 
cgccgacggc tcggcgatcg gtgcggggcg acagttccag gagttgggct 
gaccgcggtg gagttccgca accgactcaa cgcggccacc ggactgcggc 
cctgctgttc gactacccga cgccggccga cgtcgtcggg cacctgcgcg 
caccggggag gtgtcgggtg cgggctcggt gctggcggcg ctggacaacc 
gatcgccggg ctgtccctcg acgacgcggg ggagcaccag ttggtggccg 
ggtcctcagg gcgaagtggg cggacatgcg aagcgcggag ggagctgtgg 
ggacgtcgac atcgaggagg cgtcggacga cgacatgttc gcgctgctgg 
ggggctgaac tgagccgctc cgcatgagca gttccgcacc agaagttcca 
gtccacagtg acctggtcca cagtgagctg ttccacagcg acctgttccg 
gtagtgagcc gttccgctta ggcgggaaac cggggccccg gtggtcggac 
ggccaccggg agcgtcaaac cggcctcttc taaagaaagg gaaagaacta 
cacagcccga tcatgtgcaa tgaattccgt ggctcggaaa accattcgcg 
ttacggtgtg atcacgtgtc cgtacggcag tgacggcgtg cacgggttgc 
ccccggccga ggcgtctccc cgctgctgcc cggcggtgcg tgccgcttgg 
ggaggtcccc ccatgaccac gtccaccgag gagagcctgt gggcccggtg 
gcaccggccg cccccgtccg gctcttctgt ttcccacatg cgggcggctc 
tacttcccgg tgtcggccca actgtcctcg gttgccgagg tgttcgccat 
gggcgccagg accggcgcaa ggaagccggt gtcagtgacc tcgcgacctt 
gtctacgacg cgctgcgccc cctgctgaag gagcggccga gcacgttctt 
atgggcgcga cgctggcctt cgaggtggcc cggcgcttcg aggccgacga 
gtccggctgt tcgcctccgg gcgccgggcc ccctcccgcg tgcgtgaaga 
cggcggtccg acgacggcat cgtcgaggag ctgaagctgc tcgccggcac 
ctgctcggcg acgaggagat cctgcggatg atcctgcccg cgatccgcag 
gccatcgaga cctaccgctg cccgcccgac gtcaccgtcc gggcgccgct 
accggcgacc gcgacccgaa gacctccctg gacgaggccg aggcgtggcg 
accggggact tcgacctcaa ggtgcttccc ggtgggcact tcttcgtcag 
ccggcgatca tcgatctgct ccgggcgcac ctcgccggca acggctagcg 
ggcaggccgg cggtgccggt ctgccgcacg gcccccgcgc cgcctgagac 
cacccggcga cgcgcgttgc gtgcgcctgt ggagcaccgt cccgcgtgcg 



teTCcctctt 
atgcggtgtc 
ttcgtacccg 
cgctcttcgg 
tagccatcat 
tggtcgagga 
tcgtcctcgt 
acaccgacgc 
tggagcagat 
tgacgcgcgg 
tctggggtct 
tcgacctcga 
gtgaccgcga 
ccgccgacga 
gaatccccgc 
agatcgacac 
ggttgctcgc 
ggagcggtct 
ccgtggtggc 
ccggcgtgtc 
acggagtgat 
atgccgcctg 
tctcctccct 
cgttcctgga 
cgtggggtcc 
accgcctcgc 
acgcggcgac 
tccgcgggct 
ggaccgcggc 
tcggcggggc 
tgctcgggca 
tcgactcgct 
tgccggccac 
gccggctcgg 
ttgaggcggt 
gccggctgga 
acggcggtgc 
acgacgagct 
cagtggcctg 
cggcggttcc 
gcttgattcc 
gggaaacacc 
ggccatggag 
cacggcaggt 
ttcgtcccta 
cttccatcca 
ggcctccttc 
ccagtacccg 
ggccgaccag 
cgggcacagc 
cggtgacctg 
ggccgtgcac 
caacaccgcg 
cgactaccag 
gaccgtcctc 
cggccacacc 
ctccgaggcc 
ggcgcactgc 
ggcaccatgc 
tacgcgggcg 
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110220 
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110400 
110460 
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110580 
110640 
110700 
110760 
110820 
110880 
110940 
111000 
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cgcggaggac 
tgtctgacag 
gtacggagcg 
gcaccggacg 
acgtggagga 
tgggacgcga 
gggccggcct 
ccttcacggc 
ccgtcgccgg 
gcggcgacgc 
tcaccgcaga 
tgtactggct 
tccactggtg 
acctgccgct 
tgctcgccga 
ccgacgacat 
tcagccgggt 
aactccgcgc 
gcagtcacgt 
gcggcgtggc 
tcgccggcgt 
tcctggccgc 
acgtcgcccc 
ccggccgccc 
cgtggatggc 
aggccggtgt 
gcatccagat 
aggaagcgct 
gcttcacctg 
tcgccgagct 
ccctcgtgga 
tccgcgaccg 
tgctggccat 
accaggcccg 
ccgcctcctt 
tcatgctcca 
cgcgcgccct 
ccgccgtcga 
cgctggccac 
gcatcacccg 
gcgcctacgc 
gcggtcggtc 
atggcgcggt 
acggaagcga 
cccagggcgt 
tgctcgcgga 
cccacctgaa 
tcgcccagcg 
cgggtggtcg 
gcacggtggc 
tcgtgactgt 
gcgggcgtgc 
cgccggcctg 
ggtgccccgg 
cgcccacggc 
cctcctgcac 
gcacgcccag 
cgtcctgcga 
actgctggcc 
cggtcccccg 



gtgctcggcc 
ctgctgctca 
tgccactcgt 
gccgtggtcc 
acgcgtgatg 
cgacgaactg 
ggtcctgctc 
gagcgatgtc 
cgccgggtac 
ccggcgctcg 
ccccgccggt 
cgccgcccgc 
cgacgaacgc 
gctggtcgtc 
catcgccgcc 
aggcgaaatg 
cgccgccgtg 
cgagggcgtc 
cctcgcccgc 
ccgtgccatc 
cccggccgcg 
cgaccgcgtg 
gcccaccctg 
ctccgaggag 
cgcggtgctg 
gcgctgcctc 
ggcccgcgcg 
ctccctcgcc 
cctcgccgtg 
gaccgccgaa 
gtccgtgctg 
ggcggcccgg 
gaccaccgtg 
ccgcgccctg 
cgccctctcc 
gtacggccag 
gctccaccac 
gatcctcggc 
cgccctcgtc 
cccccgcctc 
ccgctgggtc 
cctggaggag 
gctcctggcg 
gttggccgag 
cgccgcaccc 
ctccccggcc 
gcgcgacgac 
ctgcggcgcc 
ggtacgcagg 
ggacctcgcg 
aaggaccatc 
ggagctgtcc 
ggtctcccag 
agcaaggcgc 
gaccccacca 
cgcctcggcc 
aacgccctgg 
gcccaggcga 
gtcctcgacg 
ccactgcccg 




ggccggatgt 
tttactccag 
tcgctcctgg 
tggtacgcgt 
cggaagcagt 
cgcaccctcg 
cacgggcccg 
tgccgcggca 
ggcggtgtgc 
cccctcctgg 
cccgacgccg 
ctcatggccc 
tccctggcct 
ctggcctggc 
cagcgccgcc 
gtgcgtcgcg 
tccggcggca 
cggccggacg 
tcggtgcgct 
gccgtactcg 
accgtcgacg 
gacttcgtcc 
gccgaactgc 
ctcgccggcc 
cgcgacgccg 
taccgggtgt 
gtcgccgaga
ggggacgtcc 
caggaatccc 
ctgggccccg 
ctcatcgtcg 
ctcaccatgc 
ctgaccgcga 
cgcgcacccg 
ctggccgacg 
gacaacgcgg 

ggggtgggcg 

gaggagcgct 
gaccgcggcg 
gaacgcttcg 
cgcggggatt 
tcgcgcttca 
accctggacc 
cgttggggga 
ggccgcgccg 
cgggccatgg 
ctgcgggccg 
gtgaagctcg 
atgaccgcct 
gtgaccggcg 
gaaacccatc 
gccgtcctgg 
gcacgcggac 
ggaaccaacc 
tgctcctcga 
aggggcggcc 
tccgctgggg 
cgcccgcgga 
gcccgcacgg 
tgcccggcat 



cctggaggag 
cccgcactat 
tgcatccatc 
cggcgcgagg 
cgggttcttc 
cccggcacgc 
ccggcatggg 
tgacggtgct 
gcgaactcct 
igggcctggc 
ccacgggtgc 
aacggccgct 
ggatcgactt 
gcagcgaggc 
ccaccgtgct 
tcttccggac 
atccgctggc 
ccgccgggga 
gcctgttgga 
gcccggagtg 
aggccctgtt 
atgacgtcgt 
gcaccaacgc 
agctcatgct 
ccgcccaggc 
tggaggtgga 
tcaacccgcc 
gcacccgcgc 
cgtccggggt 
aaccagggcc 
gggccgacga 
cgcccggcga 
tggacggccg 
gcgtcgagct 
aggtcgccga 
cggtgtggac 
ccttccccga 
gggcggacgg 
agcccgagcg 
tcatcgaata 
tccaaggagc 
gcaacccggc 
gccacgacca 
cggcgcgcgg 
gcatcgatca 
aggcccgggc 
cccgggaaca 
gcgtcgacgc 
ccccactcga 
cgagcaaccg 
tcacgagcgt 
agaccaggac 
gcgcttgaga 
gaccacctgc 
atgcggccgg 
atcggtgctc 
cgcgtgccgg 
acgggaactc 
cagcaccctg 
cgaggaggtg 



accattcgga 
atgatcttgt 
ccgaccgtcc 
tcggctcctg 
cggcttactg 
cgcggccgcc 
caagaccagt 
gtacggcacc 
cggcgggctc 
ggcccgcgcg 
ctacccggtg 
ggtcctcgtc 
cctgctgcga 
cgaaccggtc 
cggcctgcac 
cacggccgca 
cctcgcccgc 
gcgccgggcc 
gcgccggccg 
caccgagttg 
ggtgctgcgc 
ccgctccgcg 
cgcgctgttg 
gctgccggtg 
ggagagccgc 
gccggacaac 
cgaggcgatg 
ccaggtcgcc 
gcggatgctg 
cgtggaccgg 
gaaggtgacg 
cacaccggcc 
ggacgcccgg 
ggaaccctgg 
cgcgcagtac 
gtacgtcctg 
ggccctggcc 
cgccgtgctg 
cgccgaacac 
ccactggtac 
cctggacctc 
gttcgtgccg 
ggcgcgcgaa 
cctcggactg 
cctcaccgag 
cgaaicttctt 
cctgcgcgcc 
cagaaaactg 
catgctgacc 
ggccatcgcg 
ctaccggaag 
cgccacctcc 
cgaacaggac 
acgccgcagt 
gaacagcggc 
agcctgaccg 
gccaggcacg 
cgctacggcg 
gacgccgcga 
ctgcggcgca 



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aaacgtcaaa 
ggtggggcat 
ggatttgtgt 
ttctcacccc 
accacgctgg 
cgcgacgggc 
ctgctgcggt 
tgcggcgaga 
ggcctgagcg 
ctgcccgcac 
ctgcacggcc 
ctcgacgacg 
cgcgccgagg 
gcgcccgcgg 
ccgctcggcc 
ccgtcgttcg 
ctcctcgacg 
gccgaggtcg 
ccctgggtgc 
ctggcggcgc 
agggccggca 
gtgctcgacg 
ctgagcgacg 
ctcgaccagc 
ggcgccccgg 
gttgccgtcc 
cgcctcctca 
gtccagtacg 
gaggacgcgc 
gagttgcgga 
atcggtgcgg 
cagcggcaga 
tcggccgtcg 
tcgctgctgt 
gcactggacc 
gcgctctcca 
gacgcccaga 
ccccgtgtcg 
gtcctcgacg 
ctccaggccc 
cttctcgcct 
tggtgggccg 
ctcgccgcat 
gccttcatgg 
gcggtctcgc 
ctcggacacg 
gccgccgacc 
ctggtcaccg 
gggatggaac 
gaagccctct 
ctcggggtcg 
ggtcggcagc 
gagaggtagc 
gcgcacccga 
tcatcggcga 
gccggcccgg 
acgggctgcg 
ccgttctcca 
tccgccacga 
ccggcacggc 



111060 
111120 
111180 
111240 
111300 
111360 
111420 
111480 
111540 
111600 
111660 
111720 
111780 
111840 
111900 
111960 
112020 
112080 
112140 
112200 
112260 
112320 
112380 
112440 
112500 
112560 
112620 
112680 
112740 
112800 
112860 
112920 
112980 
113040 
113100 
113160 
113220 
113280 
113340 
113400 
113460 
113520 
113580 
113640 
113700 
113760 
113820 
113880 
113940 
114000 
114060 
114120 
114180 
114240 
114300 
114360 
114420 
114480 
114540 
114600 
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acccaccctg 
gcagatcctg 
cggtgacacc 
caccgtgccc 
cgtccgcgcc 
caccgccggc 
ccccgccgac 
ccacaccgtg 
ggccgtctgc 
gctgtccgag 
cgacaaggtg 
cgcggagcgc 
cgacgaggac 
gcccctgctc 
tgcctgcctc 
gttggaactg 
ggaactcgtc 
ggtccgcgcc 
gggcgaggcc 
gttggccgcc 
ccggccggtg 
ggacgcggac 
cctgatgatg
ggcggtgcac 
ggccgcccgc 
cgccgaacgc 
tgcgctgccc 
ccgggcccgc 
gcccgccctg 
cctgcggctg 
catgctcagt 
ggcgcgccgg 
ccgcgggatc 
ccggcggatc 
gtggagccgg 
acgctcctgg 
cgccgcccgc 
cgtcccaccc 
ccgcggccac 
gctccggctg 
gctcctggag 
cccggatccg 
acggtcccct 
acggcacccg 
tcgcccgcca 
gcaccgccgg 
ccggccccgc 
acgcgtccgc 
cgtccctgcg 
tgtgcacgct 
gcgccgcgca 
tcgccgggcc 
ccaggggcaa 
cgcccggcgg 
ggctcgcggg 
ccgcgctggg 
tcggtttcgc 
acgtccgctt 
agcgggaacg 
agcaggtcgc 



gtcgtggtcg 
ctgcgccacc 
acggccttcg 
gtcgcgcgct 
gtctgcggca 
aaccccgcca 
gccgaccacc 
cgcgccctgg 
ggcgacctgc 
gaccggatcc 
cacatccgct 
gccgatctgt 
gtcgcccatc 
cgccgcggat 
tcccgcgccc 
gccgcggccg 
cgcagcaccg 
atcgacctgg 
ctgccgtacg 
gtgcgggacg 
ccgccggccc 
aaggccagga 
ccgaaactgg 
ggcctggaca 
attttcaacc 
gatctggaca 
aacctgatcg 
cgactcgccg 
ctgctcgccc 
tcgcgggagt 
tggcgcccgc 
ctgcgggacg 
gcccgcctga 
cgcgaggccg 
ctgagccagg 
caggcggtcg 
accctgaccg 
ggatggcgcg 
ggcaaccgcc 
agcaacgcct 
gcgctggaag 
ggccgccctg 
cggcagcggg 
cgtcctgcgg 
actcttcgac 
ggcagagcac 
cctggaggtc 
ggagcggcgc 
ctggctggcc 
ggccgacggc 
caccgtcctg 
ccagggccgg 
cccgctgttc 
cgaccacttc 
ccatctgcgg 
cgaccacagc 
cggtgcccgc 
cgtccacggc 
ctcgcacgac 
cggccatctg 




aagacgtcca 
tcgggccgga 
acaccgaccc 
tcgtggtgcc 
cccccggcga 
tcctgcggga 
tcccggagct 
acggcctgcc 
tcgacttcca 
gcaccctgct 
tccccgcctc 
acgtccgcgc 
tgctgctgcg 
tcgccgccgc 
tgcaggaacc 
aagccgtcgc 
tcgcggacac 
ggttcgcccg 
ccgggccggc 
acgacgcgcc 
aggccggcgc 
agctcgcccg 
cggcctgcgc 
ccatgctcac 
tacgggcccg 
gcgccgagcg 
ccacccgcat 
aggccccggt 
gcgcccgtgt 
gcgggcgctg 
tggccgccga 
aggagctgtt 
cgacgcggcg 
ccgccctgct 
ccggtgccga 
cccggatgac 
tcccgtccgt 
acctgtccga 
agatcgccga 
accgcaagct 
gaccggtcgc 
gacgccgcgg 
cgttcggcgc 
gccagcgccg 
cacctgctgt 
ttcagccgac 
tcccaggcag 
ctgctgatcc 
cacctcaccc 



gaccaccggg 
cgcctggcgc 
ccgccgcagg 
ctgaccgcct 
ggcgccgtcc 
atccagccgc 
gatccggtgc 
cgcgcgctgg 
gtcgtccgcg 
gacgccgccg 
ctggccgtgg 



gtggttggac 
caccccgctc 
gaaggccccc 
cgcgctcacc 
cgaggagttc 
cgccctgcgc 
gcacgccctc 
cgccgaagtc 
ccgagtccgg 
ggcgagcgtc 
caaggcacgc 
ggccgaactc 
ctcgtcgccg 
gctgcgccgg 
cctcgacccc 
ccggccggag 
cgaccccacg 
gggcaacagc 
cgaccgggag 
gatgatcccc 
ccgtgcctgg 
gatcgccctc 
cgcgctgttc 
cgccgcccgc 
gatacacctg 
cgccctgccg 
cctcgtcagc 
ccccgccggc 
ggccgccgac 
gctttttcgc 
ggcgtgtctg 
cttcgccgac 
actcttcgac 
ccgcgactcg 
gacggcccac 
cgccgcccac 
tccggtcgcc 
ggcggagaag 
acaactcgcc 
gaggatcggc 
ggatgcttct 
aagcgggcga 
tgctgcgccg 
cctggcggga 
ccggggcggg 
tgatggacac 
tgctccaggg 
tggtcgacga
ggcggctgca 
gcaggtaccc 
cgctgtcccg 
acgcactggt 
tccggagcgc 
gggagctgag 
agccggtgcg 
tgctcgccca 
tggacgccgg 
atgcggtgga 
atctgctgta 
tccacccggg 



ccggcctcgc 
gccgtcctgg 
gccgtcccgg 
gaccgcgggg 
atcgccgcgc 
gccttcgtcg 
accgctggcg 
aacgccgtcc 
gccctcgccg 
ggcctgaccg 
gtcatcgagg 
acccacagtt 
ctcggcgcac 
gaggaccacc 
cgggaacgca 
gcgggggatc 
tcgtccggtg 
gaatgggtcc 
gaactggtcg 
gtggtgcccc 
cagctggcca 
accggcgggg 
gccaccgacg 
agtgcccacc 
tgcgcggccc 
ccgacgagtt 
atggagacgg 
ggcgaggagg 
gacggtgact 
cggcactggg 
aagctcggcg 
cgctggggca 
gacgacggcg 
cccgcccgcc 
ggcgacaccg 
cccgccagcc 
accgcgccgc 
gacaccgtgc 
gtcagcaggc 
ggacgcaagg 
tgagcgggag 
ctcctcgctc 
gataccggag 
acgcgacttc 
cggcgcaggg 
cggcgaccgc 
cgcccaggcg 
cctccagtgg 
cggcctgcgg 
cctggtccgg 
ggacgccacc 
gcgcgccgtg 
tctgcgcgcc 
cccgacggtg 
cgaggtcgcg 
gctcgccggg 
cctgttggcc 
ctccctgctc 
ccgctgcggg 
ccggccctgg 



t^fffgtggtt 
ccagcagctg 
ggccgccgga 
tcgccgccac 
tcacctccgc 
accacggcct 
tcgtcggcga 
tgcgggccct 
gcgcgcactc 
tgtccgtcgg 
acatgcccgc 
gcggcgtcaa 
cctgggtcgt 
accgggcctg 
gcctgctgac 
gacgcctggg 
agggggtggg 
gccgcaccgc 
cgctgttctg 
ggttgcccga 
cggcggggga 
tgaacgagag 
acaacgacga 
tgcgcagcat 
ggctggaggc 
ggcacccccg 
gccgcccgga 
gtgtgtggtg 
gggaggaggc 
ccaacccggc 
acgtgacgga 
ccgcgagcgc 
accgggccgt 
tggcctacct 
ccgcggccgc 
gcctcgccac 
ccaccgccgt 
tgctcgccgc 
gcaccgtgga 
agctgtacct 
aacgaactgg 
ctcctgatca 
ctggccggcg 
cccttcggga 
ccggccgaac 
cctaccggga 
ctgctcgccg 
gccgacggcc 
gcgctgctgg 
gaggtcgccg 
cgcgtcctgc 
tacgaggcgt 
accggaaggc 
ctgcgcgatc 
gtggcggtgg 
gtcgatgaaa 
cggggacggg 
accctcgacg 
cggccggccg 
tcggaggcgg 



114660 
114720 
114780 
114840 
114900 
114960 
115020 
115080 
115140 
115200 
115260 
115320 
115380 
115440 
115500 
115560 
115620 
115680 
115740 
115800 
115860 
115920 
115980
116040 
116100 
116160 
116220
116280 
116340 
116400 
116460 
116520 
116580 
116640 
116700 
116760 
116820 
116880 
116940 
117000 
117060 
117120 
117180 
117240 
117300 
117360 
117420 
117480 
117540 
117600 
117660 
117720
117780 
117840 
117900 
117960 
118020 
118080 
118140 
118200 
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tcctgcgctc 

acctgcgccg 

tcgatctggc 

aggcggtcgc 

cgtccctgct 

ggctcgacga 

aggcgtggct 

ggctgcggcg 

ccgtgctgtt 

ccggcaaccg 

cgctggtgat 

ccagcgaaca 

ccgagcgggc 

aacgcgccct 

cggtcgcctt 

accgccggcc 

acgtgcactt 

tggagaccgt 

tgcaccagcg 

gggcccggga 

tgttgcagga 

cctacgcgac 

cggaggccga 

ggctcgccga 

tgacccccag 

tcgcgaccga 

gcaagctcgg 

cgcggccgtc 

cgtcgttgtc 

gcgggggcag 

agttgggact 

atgtgtcgcg 

ctcttcggcg 

taaaagacga 

gactcgaatc 

tccgtcggcc 

caggagttcc 

gacctgatgc 

ggcaagcggc 

gcgggcaccc 

gtgctcatgg 

aagaagttcc 

accattccgc 

gggctgctgc 

atgggcatcc 

cgcccggccg 

tgtcgccgac 

gtcgatgttt 

acttttcgtt 

gtccgcggtg 

ccgaatactc 

cgtgcctcgt 

cggcacgaca 

ctgctcgatg 

ccgcccgttc 

ccaccacggc 

tgtggcccgg 

ggagctgatc 

cagcaaggaa 

gcggcacggt 




cgcggcccac 

cgccctgctg 

caccgccgag 

gctgctggac 

cgccgccccc 

accggggcag 

gcggcactcc 

catgggggca 

gagtgcgggc 

catcctcgaa 

gctctcgctg 

gcacacccgg 

cttcgtcctg 

ggtcatggac 

cgaactgcgc 

ggccggactg 

cggccggggg 

gggatggcgc 

gctcggcgag 

gtggggtgcg 

cgagggactg 

ggaactggcg 

ggcggtgttg 

acgggccgaa 

cgagcggcga 

actcggtgtg 

cgtctccggc 

ccgttccgga 

gtacgcccgt 

gtgcatccac 

ctgcggaccg 

ttgacgtgcc 

tgaatcgcgg 

cgggggttat 

accgtacgac 

ccgggctgat 

gccgccgctt 

acccgagcgt 

accgcttcgc 

tcaccgcctc 

actcgtccgg 

tgaccgagat 

tcgcctcgcg 

ggaagttgcg 

tgaacgtcgg 

gcctgtcccg 

gcccgcctct 

atccattctc 

ccgtcgtaac 

tgccacctgg 

acggtaaccc 

gcggcgtggc 

gagcgggggt 

ttcgtgccac 

ggtggcgaat 

cggcgcagcc 

acacgcagga 

cggcgcagcg 

acggtccgcc 

ggcgcctatc 



aacgcgctgc 
caccaccgca 
cgcgccctcg 
acctcgcggg 
agcccgtccg 
cgggacgagg 
ggccacgaga 
cggccgccgg 
gcgctcagcg 
cgtgagccgg 
ttcgtcgccg 
cgccggtacg 
gtgacgcagg 
gccggcgact 
gacccggcct 
gcgctcaccg 
cgggacgccc 
aattccgcac 
accgatgccg 
acgacgaacc 
gatctgctgc 
cgcaccctcg 
cgggaggccg 
ctcggtctgg 
gtggcgtcgc 
agctcccggg 
cgccgcgagt 
cgatcacaat 
ccggtcatga 
cccatcacca 
tacctcgtca 
gagttgtggc 
cgaacttcga 
ccgaaaggat 
cggcggggtg 
ggcgtccctc 
cgacgattcc 
gcagcagccg 
ctcgcacgtg 
cgcggtcacc 
cgccgccgac 
cgatgcgcgc 
gctctacctc 
ggtgcccaac 
cacctggccc 
caccggccgg 
cacggggcga 
acgttacgaa 
caaactgccc 
gccccactcc 
ccgttcctcc 
tcgcccgctt 
ggtcgacccg 
cccgaactct 
ccgccgcccg 
gaggcttgtt 
ggaacaccgt 
gatccgccga 
gggacctgaa 
ccatggtgcg 



gcgccggccg 
cccaggacgg 
accccgatgc 
atcgggccgc 
ccgtcgagtt 
agggagccga 
accccgtcga 
tggacagcgt 
gccggctcag 
ccaccgccgc 
agtccgtgca 
cgaccggcgc 
gccgcccggc 
ggtcggaacc 
tgagcgaacg 
ccaccggtca 
tggacacgct 
tgctgccctg 
cgctgcaact 
tgggccgggc 
gtgagagcgt 
tcgtcctcgg 
cggggatcgc 
gcagtgccat 
tggtgagccg 
cggtggagaa 
tggtgaatgc 
cctctgtgcc 
cagcaatcct 
atgtctcgta 
agtacggacc 
taactacgct 
tcgaattctg 
ggcggtctcg 
atatccgctc 
gaccgtgacc 
gccggtgacg 
ctgatgcgcc 
gtcgccgtgg 
ggcaagaccc 
gccgccgatg 
atcctcgaag 
agccgccagg 
cgcgccgcgc 
ccgaaggtcg 
ccaccgtcgg 
tgcccggcgg 
cccgcaatgc 
gtcatatcgt 
gacgcgcccg 
tggtggagtt 
tccccgccgc 
tccgcggtgc 
gtgttcttgg 
ttcctggcca 
ggaatcaccc 
ggatgctgag 
tgtcgtgcgt 
cgtcctggag 
gccgggctcc 



gcccgccgac 
ctgccgcgcc 
ctgtgtacgc 
cgccgtgttg 
ggtgcggcag 
cgaactcgcc 
gctggcgtcc 
cgccgaacgc 
cgccgcggag 
ccatgcccac 
gggcgtggcc 
ggacgatgtg 
cgccgcgcgg 
cgccgtcatg 
catcctggaa 
gatgctccag 
gctggcctgt 
gcgtccgtat 
cgccgaggac 
cctgcgcctg 
cgagatcctc 
gcgtcggctg 
cgccgcctgt 
cgtgccgccg 
gggcctgacc 
gcacctcacc 
gctcccgggt 
tgtcagccct 
caacaaaagc 
tgttactttt 
acggatttag 
ccgtttcacg 
gattccgttt 
tgactatcac 
agaccgcacc 
tcaccatcaa 
tctgcggtcg 
agttctcccg 
gcgcccagga 
cggacatcgc 
ccggcgtcgt 
gcatcgcggc 
gcgtggagta 
tcgtctcgcg 
tggacgactt 
accgctgtcc 
cacgggcgtt 
tcatgcgcgc 
ttccactgac 
cgacgggtcg 
ccccgcgcgt 
cgcgacagcc 
ggcgcagtca 
tgaccgtttt 
caacggatcg 
ccacccagtc 
ggccgccgca 
ctcgccgagg 
ggccatgggc 
gaggccgtct 




gcggc 



gcggcccggt 
cgcatcctgg 
cacgtcagcc 
cgcatcccgc 
gccgccgccg 
ctgcgcctgg 
tcggtggcgc 
gagttggtcg 
atcgccgaca 
accccgctgc 
tcctggctgg 
ctgctgaccg 
gagcacgtcg 
atgttcgccg 
cggatccgcg 
gccgccgtcg 
ggccgacgcc 
gcaatcgggc 
gagctgaggt 
aagggctggc 
cgcgcttcgt 
ccgggtgggc 
ggtgttccct 
gtcgccaccc 
aaccaggcca 
agcgcctatc 
cgttgacgcc 
gtgggtgcgg 
tcaaacggaa 
ggcccaccgt 
gggtgcgtag 
gaaatcgaaa 
ctcagaaggc 
tcacctcact 
ggccggggag 
gcacgccaac 
cagcttccgg 
gctgatcgag 
cgcggccttc 
cgggatcctg 
gaccagccag 
cgggctgtcg 
ccacgtcacc 
cgcctactcc 
catcaagtga 
cgcacgcccg 
tcgtgccctg 
gattccggcc 
cgtttctgtg 
cgagcggcct 
cacattcctc 
attgttacaa 
agcggaacct 
caccaatgcc 
cttcggccgc 

ggggcgccgt 
gggacatgct 
agttcgccgt 
tgatacggcg 
tcgtctcccg 
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118260 

118320 

118380 

118440 

118500 

118560 

118620 

118680 

118740 

118800 

118860 

118920 

118980 

119040 

119100 

119160 

119220 

119280 

119340 

119400 

119460 

119520 

119580 

119640 

119700 

119760 

119820

119880 

119940 

120000 

120060 

120120 

120180 

120240 

120300 

120360 

120420 

120480 

120540 

120600 

120660 

120720 

120780 

120840 

120900 

120960 

121020 

121080 

121140 

121200 

121260 

121320 

121380 

121440 

121500 

121560 

121620 

121680 

121740 

121800 
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gaccgcacag ccgatcccgg aggagtcccg gatcgccacc gcggccgccg aactcctcag 121860 

cgaggccgag acggtcttca tcgacgaggg cttcaccccg caactcatcg ccgacgccct 121920

gccgcgcgac cggccgctga ccatcgtcac cgcctcgctc ccggtggtca gcgccttcgc 121980 

gacgagccca caggccaacg tgctgctcct gggcggccgg gtccgccggg gcacgacggc 122040

caccgtcgac cactgggccg tccacatgct gtccggcttc gtcatcgacc tggccttcct 122100 

cggcgcggag gggatctcgc gcaggtacgg cctgaccacc cccgacccgg cggtcgccga 122160 

ggtcaaggcc caggccatcc gcgtcgcgcg ccgcccggtc ctcgccgggg tgcacaccaa 122220 

gttcggcacg gcgagcttct gccggttcgg agaggtgggc gacctggaga cgatcgtcac 122280 

cggcgccggc ctgcccgtcg ccgaggccca ccgctaccac ctcatgggcc ccaaggtttt 122340 

acgggtgtga cgccgccggg gcgtcccgcc gcccgcccac ggcctggcgc gcgccggtgc 122400 

gggtcagccg cgtcgcgcgt ccgtcggttc ccgccagagt gcgagcggtg tcacccccac 122460 

ccggtagccc tcctcgtcga ggagatggcc gccgtgcttc atgtgccaga gggcgaccgc 122520

gcgttcggtg acggtcaggt ccgcggcccg ttcggcgagt cggacctcga agcgctgggc 122580 

gtcgtcgagg gagcgcaccc aggccaccag gtgcaggttg tgccggctga tgacgctcgc 122640 

gcacaggcgg acctcgcgca tcccggtgac ccggcgggtc acctcgcgca gcctcgcggc 122700 

cgggacctgc ccccagaagc tcaccgtgac cggccactcg gacagcgggc gggcgacctc 122760 

gcaccgggcg tgcaacatgt ccgcggcgaa gagccgttgg acgcgacggc gcacggtgtc 122820 

ggggccggcg ccgcactgct cggccagcgc gcggtaggtg gcccggccgt ccacggacag 122880 

cgcggtgacc agctgttggt cgagctcgtc gaggacgaag gcgggggtgt ccgtgcggtg 122940

tctggacgcg tcggcggcga gccgggcgac ctggtgccgg ccgagggccc gcaggcgcca 123000

ccggctgccc tcggtgtgca ccgggccggc caggtgcgtc cgggcggccc ggacgccgtc 123060 

caacgcggcg aggtcgtggg tcacccaacg ggagagcatg gccggatcgc gggccatgac 123120 

gttgagttgg aggtcacggt ccccggtgac gtgggagagc gccaccacgt gcggcgcggc 123180 

ggccaacgcc cgcgcgacgt cgagcagtcg gccgggagcg cagtcgatct cgacgaacgc 123240 

caggcacccc tggcccgacg cggccagcac cggggccgga tggcagctga tccaggcggc 123300 

gccggtctcg accagccggt tccagcgcct ggccaccgtc acggcgtcca gtccgaggac 123360 

ggagccgatc cgggtccagc tggcccgcgg cgtgatctgc agcgcgtgca ccagggcctg 123420 

atccacatgg tccagggacc ggggggtctg cccggaatcc tgcgccacgg gccacctcct 123480 

tgcgtgtttc cggcggattc gggccgccgg tcggctcaac cttcagcctg gactcgggta 123540 

cggccggacc gtaccaggca acccccggag caacaggagt 123580 
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